[bisq-network/bisq] Reduce initial request size (#4233)

dmos62 notifications at github.com
Sun Jun 14 20:18:33 UTC 2020


@dmos62 requested changes on this pull request.

Requested changes. Mostly related to cleanup, but still significant. I found readability difficult; I would appreciate more helpful names and more intermediate variables.

I wrote up a description of the PR and the problem it's solving; I'll include it below. Some of it may not be correct, since it represents my ongoing learning about Bisq's P2P layer (and this PR). @freimair, after any necessary corrections, would you consider including some of this information in an appropriate place in the code? I think future devs will appreciate an explanation of why this datastore design was chosen. I would appreciate feedback if I got something wrong.

------------------------

A datastore is a store of Bisq P2P objects. It is synced peer-to-peer and peer-to-seednode,
and in this way makes up a distributed database.

I presume that a peer's datastore is synced in this way:
  - a peer/client starts up;
  - it starts a "bulk sync" procedure:
    - gathers up UIDs (or hashes) of the objects it already has in its datastore;
    - uploads those UIDs to a seed node;
    - seed node takes a set difference of the passed UIDs and the UIDs of the objects it has;
    - seed node uploads to the peer/client the objects corresponding to that difference;
  - peer then enters p2p mode, listening for and publishing updates.
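
The seed node's side of that last "upload the difference" step could be sketched roughly like this (illustrative only; in Bisq the keys are `P2PDataStorage.ByteArray`s and this logic lives in the get-data request handling, not a standalone helper):

```java
import java.util.*;
import java.util.stream.Collectors;

// Illustrative sketch of the seed node's set-difference step; UIDs are modelled
// as plain Strings here rather than the P2PDataStorage.ByteArray keys Bisq uses.
class BulkSyncSketch {
    static <V> List<V> objectsMissingAtRequester(Map<String, V> seedNodeStore,
                                                 Set<String> requesterUids) {
        // set difference: everything the seed node has that the requester did not list
        return seedNodeStore.entrySet().stream()
                .filter(e -> !requesterUids.contains(e.getKey()))
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }
}
```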

The problem is that this "bulk sync" procedure requires the peer to send a large number of UIDs
to the seed node; with insufficient network throughput that request can time out, causing the
sync to terminate unrecoverably.

That shows two problems:
  - Bisq does not recover from throughput issues;
  - the bulk sync is bandwidth-intensive.

This PR addresses the bandwidth intensity.

Chronological information in the P2P objects would allow querying them by comparing objects to a
timestamp-like value; however, the P2P objects do not contain chronological information. It is
unclear whether introducing a timestamp would have negative effects, but it would involve
making sensitive changes.

@freimair's proposal is to use chronologically-indexed datastore "checkpoints" that are
distributed with releases
  - (and thus identical across clients and seednodes)
  - each checkpoint contains objects generated since the last checkpoint
    - the first checkpoint contains all prior objects
    - this effectively assigns chronological information to objects
      - precision of which, for the purposes of bulk sync, corresponds to frequency of releases;
  - this way a bulk sync can be initiated by identifying
    - the latest checkpoint that the client has
    - and the objects it has stored since then (objects not in any checkpoint known to it)
    - which reduces the bandwidth requirement
      - to the equivalent of uploading UIDs that are newer than the release used
        - because identifying the checkpoint is negligible bandwidth-wise.
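
In other words, the request shrinks from "all UIDs I have" to roughly the following (a sketch of the idea only, not the PR's actual message class):

```java
import java.util.Set;

// Sketch of the reduced request under the checkpoint scheme (not the PR's actual
// message class): the newest known checkpoint replaces the bulk of the UID list.
class CheckpointSyncRequestSketch {
    final String newestKnownCheckpoint;      // e.g. a release version string; negligible bandwidth
    final Set<String> uidsSinceCheckpoint;   // only objects not covered by any known checkpoint

    CheckpointSyncRequestSketch(String newestKnownCheckpoint, Set<String> uidsSinceCheckpoint) {
        this.newestKnownCheckpoint = newestKnownCheckpoint;
        this.uidsSinceCheckpoint = uidsSinceCheckpoint;
    }
}
```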

Requirements for the checkpoint datastore:
  - initializable from the previous monolithic datastore or from scratch;
  - a "running" checkpoint has to be maintained;
    - which holds objects generated since the last "release" checkpoint;
  - at no point can the datastore be mutated, except via appends;
  - "release" checkpoints must be identical across peers and seednodes.

> +        if (getMap().containsKey(hash))
+            return;
+
+        store.getMap().put(hash, payload);
+        persist();
+    }
+
+    @Override
+    protected PersistableNetworkPayload putIfAbsent(P2PDataStorage.ByteArray hash,
+                                                    PersistableNetworkPayload payload) {
+        // make sure we do not add data that we already have (in a bin of historical data)
+        if (getMap().containsKey(hash))
+            return null;
+
+        PersistableNetworkPayload previous = store.getMap().putIfAbsent(hash, payload);
+        persist();

Reading this is a bit difficult, because `getMap()` is easy to confuse with `store.getMap()`.

`getMap().containsKey(hash) == false` implies `store.getMap().containsKey(hash) == false`, because `getMap()` is the union of `store` and the `store`s inside `history`, so the `store.getMap().putIfAbsent(hash, payload)` call here will always insert a new entry. I'd also like to point out that these `.put` and `.putIfAbsent` implementations have different return semantics than `java.util.Map`: `.put` here returns `void`, whereas `java.util.Map.put` returns the previous value. Same in `MapStoreService`.
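
To spell out the union relationship (a sketch with generic types, not the PR's exact code):

```java
import java.util.*;

// Sketch of the union relationship: getMap() merges the live store with all
// historical bins, so a key missing from the union is necessarily missing from
// the live store as well.
class UnionSketch<K, V> {
    final Map<K, V> liveStore = new HashMap<>();
    final List<Map<K, V>> historicalBins = new ArrayList<>();

    Map<K, V> getMap() {
        Map<K, V> union = new HashMap<>(liveStore);
        historicalBins.forEach(union::putAll);
        return union;   // liveStore.keySet() is a subset of union.keySet()
    }
}
```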

>          // We collect the keys of the PersistableNetworkPayload items so we exclude them in our request.
         // PersistedStoragePayload items don't get removed, so we don't have an issue with the case that
         // an object gets removed in between PreliminaryGetDataRequest and the GetUpdatedDataRequest and we would
         // miss that event if we do not load the full set or use some delta handling.
-        Set<byte[]> excludedKeys = this.appendOnlyDataStoreService.getMap().keySet().stream()
-                .map(e -> e.bytes)
-                .collect(Collectors.toSet());
+        Set<byte[]> excludedKeys;
+        if (seedNodeRepository != null && seedNodeRepository.isSeedNode(networkNode.getNodeAddress())) {

Meaning of this predicate not immediately obvious: maybe `var weAreASeedNode = seedNodeRepository != null && seedNodeRepository.isSeedNode(networkNode.getNodeAddress())`?
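
I.e. something like:

```java
// The suggestion spelled out; same predicate, just named.
boolean weAreASeedNode = seedNodeRepository != null
        && seedNodeRepository.isSeedNode(networkNode.getNodeAddress());

Set<byte[]> excludedKeys;
if (weAreASeedNode) {
    // ... as in the PR
}
```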

>          // We collect the keys of the PersistableNetworkPayload items so we exclude them in our request.
         // PersistedStoragePayload items don't get removed, so we don't have an issue with the case that
         // an object gets removed in between PreliminaryGetDataRequest and the GetUpdatedDataRequest and we would
         // miss that event if we do not load the full set or use some delta handling.
-        Set<byte[]> excludedKeys = this.appendOnlyDataStoreService.getMap().keySet().stream()
-                .map(e -> e.bytes)
-                .collect(Collectors.toSet());
+        Set<byte[]> excludedKeys;
+        if (seedNodeRepository != null && seedNodeRepository.isSeedNode(networkNode.getNodeAddress())) {
+            excludedKeys = this.appendOnlyDataStoreService.getMap().keySet().stream()
+                    .map(e -> e.bytes)
+                    .collect(Collectors.toSet());
+        } else {
+            excludedKeys = this.appendOnlyDataStoreService.getMap("since " + Version.VERSION).keySet().stream()
+                    .map(e -> e.bytes)
+                    .collect(Collectors.toSet());
+        }
 
         Set<byte[]> excludedKeysFromPersistedEntryMap = this.map.keySet()
                 .stream()
                 .map(e -> e.bytes)
                 .collect(Collectors.toSet());

`.keySet().stream().map(e -> e.bytes).collect(Collectors.toSet())` is applied 3 times in this method. Moving it to a method (like `getKeySetInBytes(Map)`) would improve readability.
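
For example (a sketch; the wildcard value type is so one helper covers both the payload maps and `this.map`):

```java
// Sketch of the suggested helper; one definition would replace the three copies
// of the keySet-to-bytes pipeline in this method.
private Set<byte[]> getKeySetInBytes(Map<P2PDataStorage.ByteArray, ?> map) {
    return map.keySet().stream()
            .map(e -> e.bytes)
            .collect(Collectors.toSet());
}
```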

> @@ -205,24 +214,61 @@ public GetUpdatedDataRequest buildGetUpdatedDataRequest(NodeAddress senderNodeAd
         return new GetUpdatedDataRequest(senderNodeAddress, nonce, this.getKnownPayloadHashes());
     }
 
+    /**
+     * Create the special key. <br><br>

"Special key" is a problem semantically, in that it doesn't describe how it is special. Better to be concrete: it's the release version string in bytes. That's not my main point here, but I noticed a few phrasings like this so felt that it should be mentioned.

The actual problem is that there's string and byte manipulations in various files in this PR. You can put all this logic in `bisq/common/app/Version.java` under `toP2PDataStorageKey`, `static fromP2PDataStorageKey`, etc. Most places where strings are being passed around, `Version`s should be passed around instead. For example, `Version.history` would do better as `List<Version>`, than `List<String>`.
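
For example, something along these lines (a rough sketch only; it assumes the "special key" is simply the release version string encoded as bytes, per the comment above, and a proper `Version` value type would be a better parameter/return type than `String`):

```java
// Rough sketch of centralizing the key handling in bisq.common.app.Version.
// Signatures are illustrative; assumes the key is the version string as UTF-8 bytes.
// Requires: import java.nio.charset.StandardCharsets;
public static byte[] toP2PDataStorageKey(String version) {
    return version.getBytes(StandardCharsets.UTF_8);
}

public static String fromP2PDataStorageKey(byte[] key) {
    return new String(key, StandardCharsets.UTF_8);
}
```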

> @@ -55,6 +55,18 @@ public void readFromResources(String postFix) {
         services.forEach(service -> service.readFromResources(postFix));
     }
 
+    /**
+     * TODO to be implemented

Probably forgot to remove this.

> + * <li>shoveling around data in case Bisq gets updated to a new version</li>
+ * <li>takes care of setting up a fresh Bisq install</li>
+ * <li>makes sure that historical data cannot be altered easily</li>
+ * <li>adding data to the store will only amend the live store</li>
+ * <li>takes care of keeping the legacy API functional</li>
+ * <li>adds the feature of filtering object queries by Bisq release</li>
+ * </ul>
+ * </p>
+ *
+ * <h3>Further reading</h3><p><ul>
+ *     <li><a href="https://github.com/bisq-network/projects/issues/25">project description</a></li>
+ * </ul></p>
+ */
+ at Slf4j
+public abstract class SplitStoreService<T extends SplitStore> extends MapStoreService<T, PersistableNetworkPayload> {
+    protected HashMap<String, SplitStore> history;

`HashMap<Version, SplitStore>` would be better.

> +    public Map<P2PDataStorage.ByteArray, PersistableNetworkPayload> getMap(String filter) {
+        HashMap<P2PDataStorage.ByteArray, PersistableNetworkPayload> result = new HashMap<>();
+        result.putAll(store.getMap());
+
+        // TODO do a proper language, possibly classes
+        if (filter.startsWith("since ")) {
+            String finalFilter = filter.replace("since ", "");
+            if (!finalFilter.equals(Version.VERSION)) {
+                history.entrySet().stream()
+                        .filter(entry -> parseSpecialKey(entry.getKey()) > parseSpecialKey(finalFilter))
+                        .forEach(entry -> result.putAll(entry.getValue().getMap()));
+            }
+        }
+
+        return result;
+    }

`getMap(String)` is cryptic. `getObjectsSince(Version)` would be much better. Notice that this gets rid of the language/DSL. Doing a language is not necessary, because currently there's only one use for these version keys here: getting objects from releases newer than the one specified.

`parseSpecialKey(finalFilter)` is called a lot, could be cached.
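
Roughly (a sketch only; it assumes a comparable `Version` type and a `HashMap<Version, SplitStore>` history, as suggested above):

```java
// Sketch of the suggested replacement for getMap("since " + ...); assumes a
// Comparable Version type and a HashMap<Version, SplitStore> history field.
public Map<P2PDataStorage.ByteArray, PersistableNetworkPayload> getObjectsSince(Version version) {
    Map<P2PDataStorage.ByteArray, PersistableNetworkPayload> result = new HashMap<>(store.getMap());
    history.entrySet().stream()
            .filter(entry -> entry.getKey().compareTo(version) > 0)   // only bins newer than the given release
            .forEach(entry -> result.putAll(entry.getValue().getMap()));
    return result;
}
```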

> +            log.error("Could not copy resourceFile {} to {}.\n{}", resourceFileName, destinationFile.getAbsolutePath(), e.getMessage());
+            e.printStackTrace();
+        }
+
+        // split
+        // - get all
+        SplitStore historicalStore = readStore(destinationFile.getName());
+        // - subtract all that is in resource files
+        store.getMap().keySet().removeAll(historicalStore.getMap().keySet());
+
+        // - create new file with leftovers
+        storage.initAndGetPersisted(store, 0);
+        storage.queueUpForSave();
+
+        return historicalStore;
+    }

Why are we copying historical datastore checkpoints/splits out of resources? Because that's what the live-store-only solution we currently use does? For troubleshooting purposes? Seems unnecessary, though I may be missing something. I presume we keep the live store in the working directory, because we can't mutate resource files (is that even true?). 

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/bisq-network/bisq/pull/4233#pullrequestreview-430234666