[bisq-network/bisq] Improve DAO state management (Issue #5783)

chimp1984 notifications at github.com
Wed Oct 27 12:05:58 CEST 2021


The DAO state is getting larger with each new block and is a major driver of Bisq's memory consumption.
Currently it is about 150 MB, of which 135 MB are the blocks. We load it at startup, which takes about 2-4 seconds on my machine, and after parsing is complete we write snapshots to disk, which also takes a few seconds. Persistence happens on background threads, so it is not a major problem, but it is still quite slow.

I was wondering what we could do to improve that situation.
Best would be if we could avoid loading 150 MB into memory, but I fear that is not feasible without either a huge and risky effort to optimize and customize the data access, or trying to use some DB, which I am pretty sure would degrade performance a lot, as we need most of the data for different forms of access (the large chunk of data is transactions, and we need to look those up for various properties in the DAO context). In Bitcoin it is easier, as there spendability is the only real aspect to consider, so pruning data that is not relevant for future use cases can be done more easily. In Bisq the DAO has many different use cases where old data is still relevant, so pruning is likely not really possible.

The only way I see is to create all kinds of data views (lookup maps) for the various use cases we have. E.g., if we need to iterate all txs to find potential locked outputs, we could make a lookup map where the txOutputType we are interested in is used as key and we hold only the few relevant txs there (see the sketch below). If all use cases could be optimized that way, we could release the blocks/txs and keep only the relevant views, which would likely be much smaller. That would be one approach, but I am not 100% sure it would really lead to less data, and it comes with considerable effort and risks for DAO consensus.

But the 150 MB in memory is not the only concern: at snapshot creation we have to keep a copy of it in memory. Here we could optimize with much less risk and effort.
We could extract the blocks from the daoState into a dedicated data store and split it up either automatically into buckets of e.g. 1000 blocks, or use a versioning model similar to the one we use for trade statistics (historical data store). The automatically managed approach would probably be better, as it would not require work from the release manager. As blocks are immutable, we can treat all old buckets as immutable, closed data structures as well and therefore won't need to persist them again. They are only loaded to get the data and are concatenated with all the other buckets to form the full BSQ blockchain. We only persist the current bucket, which will be much faster, as the data size will be a few MB, not 150 MB (see the sketch below).
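
A rough sketch of the bucketing, assuming the existing DAO Block class with its getHeight() accessor; BlockBucketStore and the bucket size are illustrative placeholders, not a final design:

    import java.util.List;
    import java.util.stream.Collectors;

    // Hypothetical persistence interface; the actual store would reuse
    // Bisq's existing protobuf-based persistence.
    interface BlockBucketStore {
        void write(int bucketIndex, List<Block> blocks);
        void delete(int bucketIndex);
    }

    public class BlockBuckets {
        private static final int BUCKET_SIZE = 1000;

        // All buckets except the last are closed and never re-persisted.
        static int bucketIndex(int blockHeight) {
            return blockHeight / BUCKET_SIZE;
        }

        // At snapshot time only the current (open) bucket is written,
        // so we persist a few MB instead of the full 150 MB block list.
        // Assumes a non-empty, height-ordered block list.
        static void persistSnapshot(List<Block> blocks, BlockBucketStore store) {
            int openIndex = bucketIndex(blocks.get(blocks.size() - 1).getHeight());
            List<Block> openBucket = blocks.stream()
                    .filter(b -> bucketIndex(b.getHeight()) == openIndex)
                    .collect(Collectors.toList());
            store.write(openIndex, openBucket);
        }
    }
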
The rest of the DAO state is about 15 MB, so that is also OK for fast persistence.
There is still a smaller issue with re-orgs. Say we just moved to a new bucket with a new block, and then there is a 3-block-deep re-org, so it changes blocks in the previous bucket, which we considered immutable/closed. But that is a minor issue and there are ways to deal with it, like deleting the previous bucket and resyncing (see the sketch below). We already do a resync from the last snapshot in case of re-orgs today. We do not need to support arbitrarily deep re-orgs, as realistically re-orgs are never deeper than 3-4 blocks.
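
For completeness, a sketch of how the bucket store could react to a re-org, reusing the hypothetical BlockBuckets/BlockBucketStore from above; requestResyncFromLastSnapshot() stands in for the existing resync mechanism:

    public class ReorgHandler {
        // If the fork point falls into an already-closed bucket, discard
        // that bucket and rebuild from the last snapshot, mirroring the
        // resync we already do today on re-orgs. Since re-orgs are only a
        // few blocks deep, at most one bucket boundary can be crossed.
        static void onReorg(int forkHeight, int currentHeight, BlockBucketStore store) {
            int forkBucket = BlockBuckets.bucketIndex(forkHeight);
            int openBucket = BlockBuckets.bucketIndex(currentHeight);
            if (forkBucket < openBucket) {
                store.delete(forkBucket);
                requestResyncFromLastSnapshot();
            }
            // Otherwise only the open bucket is affected, and the normal
            // snapshot-based persistence covers it.
        }

        // Placeholder for the existing resync-from-last-snapshot logic.
        static void requestResyncFromLastSnapshot() {
        }
    }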

So I think this approach is feasible to implement and comes with little risk of DAO consensus issues.

Any other ideas? Feedback?


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/bisq-network/bisq/issues/5783