Scalable org file synchronization solution

In my previous post, I found a rough way to synchronize my org files. I was pretty happy with it: I had only around six org pages, so copying them to a WebDAV server was not a huge hassle :P, until I started to use org-roam. Suddenly I had to create an org file for every new zettel, and the performance quickly started to tank. This is probably the beauty and the curse of using open source solutions: you have full control of your work, and you have to solve the problems you created yourself.

An idea started to form in my brain (maybe I should write a new zettel, lol). It is surely a waste of bandwidth to copy all the files that haven’t changed; I only need to copy the ones that actually changed. So in my head, there are two separate problems:

  1. I need to record whether any changes were made to a file, which means a checksum.
  2. I need to compare the checksums between local and remote to decide what to copy.

Happily, it is possible to do both with only functions built into Emacs. Solving the first problem means hashing the file whenever I save:

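  (defun org-msync-after-save-hook ()
    "Record the SHA-1 checksum of the buffer that was just saved."
    (let ((fname (org-msync-local-entry (buffer-file-name (current-buffer))))
          (chksum (secure-hash 'sha1 (current-buffer))))
      ;; key: entry name of the file, value: checksum of the buffer contents
      (puthash fname chksum org-msync-checksums)))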

This little snippet computes the SHA-1 checksum of the buffer every time we save and puts it into a hash table, org-msync-checksums. The key is the name of the file and the value is its checksum. At some point we want to store the table on disk, which is done by another function, org-msync-flush-chksums.
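The post does not show how the table is created or how the hook gets installed, so here is a minimal sketch of that wiring. All `demo-` names are hypothetical and not taken from org-msync.el. The one detail worth noting is that a table keyed by file-name strings needs `:test 'equal`, because the default `eql` test compares strings by identity.

```elisp
;; Hypothetical names (demo-*); an assumption, not the code in org-msync.el.
(defvar demo-checksums (make-hash-table :test 'equal)
  "Map from file name to SHA-1 hex digest.
The `equal' test is required so that string keys compare by content.")

(defun demo-record-checksum ()
  "Store the SHA-1 of the current buffer under its file name."
  (when buffer-file-name
    (puthash buffer-file-name
             (secure-hash 'sha1 (current-buffer))
             demo-checksums)))

(defun demo-enable ()
  "Run `demo-record-checksum' after each save, in the current buffer only."
  ;; The final t makes the hook buffer-local.
  (add-hook 'after-save-hook #'demo-record-checksum nil t))
```

Something like `(add-hook 'org-mode-hook #'demo-enable)` would then keep the bookkeeping limited to org buffers.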

  (defun org-msync-flush-chksums ()
    "Flush our checksums to the disk."
    (when (> (hash-table-count org-msync-checksums) 0)
      ;; 1. get hash table from json
      ;; 2. push hashes from org-msync-checksums to this json obj
      ;; 3. purge org-msync-checksums
      ;; 4. encode this hash table and write it back to json-local
      (let* ((json-local (org-msync-local-json))
             (sums (org-msync-get-chksums json-local)))
        (maphash (lambda (k v) (puthash k v sums))
                 org-msync-checksums)
        (clrhash org-msync-checksums)
        (org-msync-write-chksums json-local sums)
        (message "flushed org checksums"))))

This function runs on auto-save-hook so we don’t overdo it. Emacs’ JSON implementation gives us a free beer here: we can load a JSON file into a hash table and vice versa. This saves me a lot of trouble and offers good performance.
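For the curious, the JSON round trip can be sketched with the built-in json.el library. This is only an illustration: the `demo-` functions are hypothetical stand-ins, and the real org-msync-get-chksums and org-msync-write-chksums may well be implemented differently.

```elisp
(require 'json)

;; Hypothetical helpers (demo-*); an assumption, not the code in org-msync.el.
(defun demo-read-checksums (file)
  "Return the JSON object stored in FILE as a hash table.
Return an empty table when FILE does not exist yet."
  (if (file-exists-p file)
      ;; Ask json.el for hash tables with string keys instead of alists.
      (let ((json-object-type 'hash-table)
            (json-key-type 'string))
        (json-read-file file))
    (make-hash-table :test 'equal)))

(defun demo-write-checksums (file table)
  "Write hash TABLE to FILE as a JSON object."
  (with-temp-file file
    (insert (json-encode table))))
```

Binding `json-object-type` to `hash-table` is what makes the reader hand back a hash table directly, so merging fresh checksums into it is just a `puthash` per entry.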

Finally, I have two commands, org-msync-push and org-msync-pull, for explicit synchronization. The code is rather long, but the idea is simple:

  1. When pushing, compare every entry in my local hash to the remote, copying when not equal.
  2. When pulling, compare every entry in the remote hash to the local, copying when not equal.
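The comparison in both steps boils down to one helper with the arguments swapped. Below is a minimal sketch of that idea, assuming both sides have already been loaded into hash tables that map file names to checksums. `demo-changed-entries` is a hypothetical name, and the actual copying to and from the server is left out.

```elisp
;; Hypothetical helper; an assumption, not the code in org-msync.el.
(defun demo-changed-entries (source target)
  "Return the sorted keys of SOURCE that need copying to TARGET.
A key qualifies when TARGET lacks it or holds a different checksum."
  (let (changed)
    (maphash (lambda (name sum)
               ;; `gethash' returns nil for a missing key, which never
               ;; equals a checksum string, so new files are included.
               (unless (equal sum (gethash name target))
                 (push name changed)))
             source)
    (sort changed #'string<)))
```

A push would copy the files in `(demo-changed-entries local remote)`, and a pull the files in `(demo-changed-entries remote local)`.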

The result is a small elisp module: org-msync.el. Enjoy :P
