It turned out that the "sync up" procedure of repeatedly generating a
block and waiting for a notification once with timeout is too naive in
its current form, as the following scenario could happen:
- generate block A
- receive notification, timeout happens -> repeat procedure
- generate block B
- node publishes block A notification
- receive notification, we receive the one caused by block A
-> sync-up procedure is completed
- node publishes block B
- the actual test starts
- on the first notification reception, one caused by block B is received,
rather than the one actually caused by test code, leading to failure
This change ensures that after each test block generation, we wait for
the notification that is actually caused by that block and ignore others
from possibly earlier blocks.
Co-authored-by: Jon Atack <jon@atack.com>
Speeds up the zmq test roughly by a factor of 2x (~20 sec. instead of
~40 sec.) and also avoids timeouts on the synchronization methods
(sync_mempool() / sync_blocks()) that happened with a slight chance.
This is due to the fact that there is no upper bound on the trickle
relay time, so even the default of 60s is sometimes too low. Fixed by
enabling immediate tx relay on node1.
After connecting the subscriber sockets to the node, there is no
guarantee that the node's zmq publisher interfaces are ready yet, which
means that potentially the first expected notification messages could
get lost and the test fails. Currently this is handled by just waiting
for a short period of time (200ms), which works most of the time but is
still problematic, as in some rare cases the setup time takes much
longer, even in the range of multiple seconds.
The solution in this commit approaches the problem by using a more
robust method of syncing up, originally proposed by instagibbs:
1. Generate a block on the node
2. Try to receive a notification on all subscribers
3. If all subscribers get a message within the timeout (1 second),
we are done, otherwise repeat starting from step 1
The ZMQSubscriber reception methods currently assert that the first
received publisher message has a sequence number of zero. In order to
fix the current test flakiness via "syncing up" to nodes in the setup
phase, we have to cope with the situation that messages get lost and the
first actual received message has a sequence number larger than zero.
-BEGIN VERIFY SCRIPT-
# max-depth=0 excludes test/functional/test_framework/...
FILES=$(git grep -l --max-depth 0 "connect_nodes" test/functional)
# Replace (dis)?connect_nodes(self.nodes[a], b) with self.(dis)?connect_nodes(a, b)
sed -i 's/\b\(dis\)\?connect_nodes(self\.nodes\[\(.*\)\]/self.\1connect_nodes(\2/g' $FILES
# Remove imports in the middle of a line
sed -i 's/\(dis\)\?connect_nodes, //g' $FILES
sed -i 's/, \(dis\)\?connect_nodes//g' $FILES
# Remove imports on a line by themselves
sed -i '/^\s*\(dis\)\?connect_nodes,\?$/d' $FILES
sed -i '/^from test_framework\.util import connect_nodes$/d' $FILES
-END VERIFY SCRIPT-
Co-authored-by: Elliott Jin <elliott.jin@gmail.com>
a4edb168b6 ZMQ: add options to configure outbound message high water mark, aka SNDHWM (mruddy)
Pull request description:
ZMQ: add options to configure outbound message high water mark, aka SNDHWM
This is my attempt at https://github.com/bitcoin/bitcoin/pull/13315
Tree-SHA512: a4cc3bcf179776899261a97c8c4f31f35d1d8950fd71a09a79c5c064879b38e600b26824c89c4091d941502ed5b0255390882f7d44baf9e6dc49d685a86e8edb
This requires a small changes to a few tests, but means that
deterministic addresses will always be imported (unless setup_nodes
behaviour is explicitly overridden).
faa4043c66 qa: Run more tests with wallet disabled (MarcoFalke)
Pull request description:
Instead of skipping the whole test, only skip the wallet specific section of a test if the wallet is not compiled in. This is mostly an indentation change, so can be reviewed with `--ignore-all-space`.
Tree-SHA512: 5941a8b6b00dca5cf9438c5f6f010ba812115188a69e427d7ade4c1ab8cfe7a57c73daf52c66235dbb24b1cd9ab7c7a17c49bc23d931e041b605d79116a71f66