Pull additional nodes from virtu's crawler. Data includes sufficient
Onion and I2P nodes to align the uptime requirements for these networks
to that of clearnet nodes (i.e., 50%). Data also includes more than
three times the number of CJDNS nodes currently hardcoded into
nodes_main_manual.txt, so the hardcoded nodes become obsolete.
The crawlers are not guaranteed to output nodes in a random order, so
shuffle the ips list after parsing to break any biasing that may be
caused by the output order.
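A minimal sketch of the shuffle step, assuming the parsed entries are held in a list of dicts. The `shuffled` helper, the sample records and the optional seed are illustrative assumptions, not the script's actual code:

```python
import random

# Hypothetical parsed records (documentation addresses); crawler output
# may arrive sorted or grouped by source.
ips = [{"ip": f"192.0.2.{i}", "port": 8333} for i in range(1, 11)]

def shuffled(records, seed=None):
    """Return a shuffled copy so later selection steps are not biased by input order."""
    rng = random.Random(seed)  # seed is only useful for reproducible demos
    out = list(records)        # copy, leave the input untouched
    rng.shuffle(out)
    return out

result = shuffled(ips)
```

Shuffling a copy keeps the parsed list intact in case the original order is still needed for debugging.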
Update the user agent regex to match all 3 digits of the version number,
not just the first 2 digits.
Also update it to include 24.2, 25.2, 26.1, 27.0, 27.1, 27.99, 28.0 and
28.99.
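To illustrate what matching all three version components means, here is a hedged sketch. The pattern below is an assumption that covers only a small subset of versions (27.x and 28.x); it is not the expression used in makeseeds.py:

```python
import re

# Illustrative subset only: accepts 27.0, 27.1, 27.99, 28.0 and 28.99,
# and requires a third numeric component before the closing "/" or a
# "(" that starts a uacomment.
PATTERN_AGENT = re.compile(r"^/Satoshi:(27\.(0|1|99)|28\.(0|99))\.\d+[/(]")

def is_known_agent(agent: str) -> bool:
    return PATTERN_AGENT.match(agent) is not None

accepted = [a for a in (
    "/Satoshi:27.1.0/",
    "/Satoshi:28.99.0/",
    "/Satoshi:27.0.0(comment)/",
    "/Satoshi:27.10.0/",   # minor version 10 is not in the list
    "/Satoshi:27.0/",      # only two components
) if is_known_agent(a)]
```

Because the pattern continues past the second component, a string such as `27.10.0` cannot slip through by matching only the prefix `27.1`.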
6abe772a17 contrib: Add asmap-tool (Fabian Jahr)
Pull request description:
This adds `asmap.py` and `asmap-tool.py` from sipa's `nextgen` branch: https://github.com/sipa/asmap/tree/nextgen
The motivation is that we should maintain the tooling for de- and encoding asmap files within the bitcoin core repository because it is not possible to use an asmap file that is not encoded.
We already had an earlier version of `asmap.py` within the seeds contrib tools. The newer version has only a small number of changes and is still compatible, so the old version is removed from contrib/seeds and the new version is made available to `makeseeds.py`.
ACKs for top commit:
virtu:
ACK [6abe772](6abe772a17)
0xB10C:
ACK 6abe772a17
achow101:
ACK 6abe772a17
brunoerg:
ACK 6abe772a17
Tree-SHA512: cc2a82ffa4eb46fa0ce4ca769dd82f8d0d2f37fc3652aa748eeb060e1142f9da4035008fe89433e2fd524a4dc153b7b9c085748944b49137b37009b0c0be8afb
Since Python 3.9, type hinting has become a little less awkward, as for
collection types one doesn't need to import the corresponding
capitalized types (`Dict`, `List`, `Set`, `Tuple`, ...) anymore, but can
use the built-in types directly. [1] [2]
This commit applies the replacement for all Python scripts (i.e. in the
contrib and test folders) for the basic types:
- typing.Dict -> dict
- typing.List -> list
- typing.Set -> set
- typing.Tuple -> tuple
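As an illustration of the replacement (requires Python 3.9 or later), the functions below are hypothetical helpers written only for this example; they are not taken from the repository:

```python
# Before Python 3.9 these annotations needed:
#   from typing import Dict, List, Set, Tuple
#   def count_by_net(ips: List[Dict[str, str]]) -> Dict[str, int]: ...

def count_by_net(ips: list[dict[str, str]]) -> dict[str, int]:
    """Count entries per network using built-in generic types."""
    counts: dict[str, int] = {}
    for entry in ips:
        counts[entry["net"]] = counts.get(entry["net"], 0) + 1
    return counts

def split_host_port(addr: str) -> tuple[str, int]:
    """Split 'host:port' on the last colon."""
    host, _, port = addr.rpartition(":")
    return host, int(port)

def unique_nets(ips: list[dict[str, str]]) -> set[str]:
    return {entry["net"] for entry in ips}
```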
[1] https://docs.python.org/3.9/whatsnew/3.9.html#type-hinting-generics-in-standard-collections
[2] https://peps.python.org/pep-0585/#implementation for a list of type
3cc989da5c Fix checking bad dns seeds without casting (Yusuf Sahin HAMZA)
Pull request description:
- Since seed lines are parsed as `str`, comparing the `good` column directly with **0** (an `int`) in the if statement never matched. This is fixed by casting the values in the `good` column of the seeds text file to `int`.
- Lines that start with a comment in the seeds text file are now ignored.
- The if statement that checks for bad seeds is moved to the top of the `parseline` function, since there is no point in processing a seed further once it is known to be bad.
Since this bug fix eliminates more than **550k** bad seeds up front, in my case the job that parses all seeds is sped up by **600%** and the whole script by **30%**.
Note that the **stats** printed in the terminal no longer include bad seeds after this fix, as would have been the case if the bug had never existed.
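A minimal sketch of the fix, assuming whitespace-separated seed lines with the address in the first column and `good` in the second. The column layout and the returned dict are assumptions made for illustration; the real `parseline` extracts more fields:

```python
def parseline(line: str):
    """Return a dict for a usable seed line, or None if the line is skipped."""
    # Ignore comment lines.
    if line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) < 2:
        return None
    # Every field is a str, so `fields[1] == 0` is always False.
    # Cast first, and reject bad seeds before doing any further work.
    if int(fields[1]) == 0:
        return None
    return {"address": fields[0], "good": int(fields[1])}

lines = [
    "# address good",
    "192.0.2.1:8333 1",
    "192.0.2.2:8333 0",
]
seeds = [s for s in map(parseline, lines) if s is not None]
```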
ACKs for top commit:
achow101:
ACK 3cc989da5c
jonatack:
ACK 3cc989da5c
Tree-SHA512: 13c82681de4d72de07293f0b7f09721ad8514a2ad99b0584d1c94fa5f2818821df2000944f9514d6a222a5dccc82856d16c8c05aa36d905cfa7d4610c629fd38
Since seed lines are parsed as 'str', comparing the 'good' column directly
with 0 ('int' type) in the if statement never matched. This is fixed by
casting the values in the 'good' column of the seeds text file to 'int'.
Lines that start with a comment in the seeds text file are now ignored.
The if statement that checks for bad seeds is moved to the top of the
'parseline' function, since there is no point in going further if a seed is bad.
Add an argument `-a` to provide an asmap file to do the IP to ASN
lookups.
This speeds up the script greatly, and makes the output deterministic.
Also removes the dependency on `dns.lookup`.
I've annotated the output with ASxxxx comments to provide a way to
verify the functionality.
For now I've added instructions in README.md to download and use the
`demo.map` from the asmap repository. When we have some other mechanism
for distributing asmap files we could switch to that.
This continues #24824. I've removed all the fallbacks and extra
complexity, as everyone will be using the same instructions anyway.
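To show why a local lookup is deterministic and needs no network access, here is a sketch that swaps in a plain longest-prefix match over an in-memory table. This is not the asmap format or the `asmap.py` API; the table, the `lookup_asn` helper, the documentation prefixes and the private-use ASNs are all assumptions made for illustration:

```python
import ipaddress

# Hypothetical prefix-to-ASN table. A real asmap file is a compact
# encoded mapping; this dict only stands in for it.
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 64512,
    ipaddress.ip_network("192.0.2.128/25"): 64513,
    ipaddress.ip_network("2001:db8::/32"): 64514,
}

def lookup_asn(ip: str):
    """Return the ASN of the longest matching prefix, or None if unmapped."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in PREFIX_TO_ASN.items():
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, asn)
    return best[1] if best else None

# Annotate each address with an ASxxxx comment, similar in spirit to the
# annotated output described above.
for ip in ["192.0.2.1", "192.0.2.200", "2001:db8::1", "198.51.100.7"]:
    asn = lookup_asn(ip)
    print(f"{ip}  # AS{asn}" if asn is not None else f"{ip}  # unmapped")
```

The same input and the same table always produce the same result, which is not guaranteed when each lookup is a live DNS query.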
Co-authored-by: Pieter Wuille <pieter.wuille@gmail.com>
Co-authored-by: James O'Beirne <james.obeirne@pm.me>
Co-authored-by: russeree <reese.russell@ymail.com>
I have some qualms with maintaining a suspicious hosts list as part of
the repository. But also, it's stale and irrelevant. I've checked the
entire list and none of them is connectable. Only one still appears in
`nodes_main.txt` but with low uptime and an old subversion string so it
wouldn't be picked in the first place.
Documentation:
- Use https URL for bitcoin.sipa.be (http sends a redirect, fooling
curl).
- Add explicit step to add manual seeds.
Code:
- Change PATTERN_ONION to v3 (effectively means that no onion hosts
are delivered).
- Add versions to PATTERN_AGENT filter.
- Print specific message on resolve exception.
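A sketch of what a v3-only onion pattern could look like. v3 onion names are 56 base32 characters, whereas v2 names were 16; the exact expression in the script may differ, and the addresses below are synthetic strings rather than real services:

```python
import re

# 56 base32 characters (a-z, 2-7), ".onion", then a port.
PATTERN_ONION = re.compile(r"^([a-z2-7]{56}\.onion):(\d+)$")

v3_like = "a" * 56 + ".onion:8333"   # synthetic v3-length name
v2_like = "a" * 16 + ".onion:8333"   # v2-length name, no longer accepted

m = PATTERN_ONION.match(v3_like)
host, port = (m.group(1), int(m.group(2))) if m else (None, None)
```

The pattern checks only length and alphabet; it does not validate the checksum embedded in a real v3 address.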
961f148cb1 doc: update contrib/seeds/README dnspython installation info (Jon Atack)
dd7b5f46d8 script: fix deprecation warning in makeseeds.py (Jon Atack)
Pull request description:
Seen while reviewing #20237.
1. Fix a deprecation warning in `contrib/seeds/makeseeds.py`
```
makeseeds.py:139: DeprecationWarning: please use dns.resolver.resolve() instead
asn = int([x.to_text() for x in dns.resolver.query('.'.join(
```
- Per https://dnspython.readthedocs.io/en/latest/whatsnew.html, `dns.resolver.query()` was deprecated in `dnspython` version 2.0.0.
- See https://dnspython.readthedocs.io/en/latest/resolver-class.html for more info on the resolver class.
2. Update the `dnspython` dependency installation instructions in `contrib/seeds/README`
- The markdown rendering can be seen here: https://github.com/jonatack/bitcoin/tree/contrib-seeds-fixups/contrib/seeds
ACKs for top commit:
laanwj:
code review ACK 961f148cb1
Tree-SHA512: f9c4f318a1a0d35b8de147d24b72c534a1f58eece31e7cfa00b4149a63b6a618d8ca0312f52fd8056f3c645cf2ee68574ca02319fddffdad919a70cd33395d33
makeseeds.py:139: DeprecationWarning: please use dns.resolver.resolve() instead
asn = int([x.to_text() for x in dns.resolver.query('.'.join(
per https://dnspython.readthedocs.io/en/latest/whatsnew.html
dns.resolver.query() was deprecated in dnspython version 2.0.0
Update hardcoded seeds from seeds_emzy.txt seeds_lukejr.txt
seeds_sipa.txt seeds_sjors.txt, according to release process.
Output from makeseeds.py:
```
IPv4 IPv6 Onion Pass
1364173 244127 2454 Initial
1364173 244127 2454 Skip entries with invalid address
1129552 213117 2345 After removing duplicates
1129548 213117 2345 Skip entries from suspicious hosts
338216 191944 2249 Enforce minimal number of blocks
336851 188993 2189 Require service bit 1
6998 1520 150 Require minimum uptime
5682 1290 89 Require a known and recent user agent
5622 1279 89 Filter out hosts with multiple bitcoin ports
512 146 89 Look up ASNs and limit results per ASN and per net
```
e1c582cbaa contrib: makeseeds: Read suspicious hosts from a file instead of hardcoding (Sanjay K)
Pull request description:
referring to: https://github.com/bitcoin/bitcoin/issues/17020
good first issue: reading SUSPICIOUS_HOSTS from a file.
I haven't changed the base hosts that were included in the original source, just made it readable from a file.
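A minimal sketch of reading the list from a file instead of hardcoding it. The one-host-per-line layout, the `#` comment handling and the `load_suspicious_hosts` name are assumptions for illustration; an in-memory `io.StringIO` stands in for the real file:

```python
import io

def load_suspicious_hosts(fileobj) -> set[str]:
    """Read one host per line, ignoring blank lines and '#' comments."""
    hosts = set()
    for line in fileobj:
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.add(line)
    return hosts

# Stand-in for the hosts file, using documentation addresses.
sample = io.StringIO("# suspicious hosts\n192.0.2.10\n\n198.51.100.20\n")
SUSPICIOUS_HOSTS = load_suspicious_hosts(sample)

ips = [{"ip": "192.0.2.10"}, {"ip": "203.0.113.5"}]
kept = [entry for entry in ips if entry["ip"] not in SUSPICIOUS_HOSTS]
```

With a real file, the same function could be called as `load_suspicious_hosts(open(path, encoding="utf8"))`, where `path` is whatever location the script uses.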
ACKs for top commit:
practicalswift:
ACK e1c582cbaa -- diff looks correct
Tree-SHA512: 18684abc1c02cf52d63f6f6ecd98df01a9574a7c470524c37e152296504e2e3ffbabd6f3208214b62031512aeb809a6d37446af82c9f480ff14ce4c42c98e7c2
- Change regular expression to cover recent versions, as well as
subversions with custom uacomment, and improve readability.
- Vary uptime requirements per network (onions are allowed to have less
uptime, to make sure we get enough of them)
- Add deduplication step (to allow simple concatenation of multiple seeds files).
- Log the number of nodes (per network) after every step.
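The deduplication, per-network uptime and logging steps listed above can be sketched as follows. The record layout, the threshold values and the helper names are assumptions for illustration, not the script's actual code:

```python
# Hypothetical thresholds in percent; the commit only states that onions
# are allowed less uptime than other networks.
MIN_UPTIME = {"ipv4": 50.0, "ipv6": 50.0, "onion": 10.0}

def dedup(ips):
    """Keep one entry per (ip, port); a later entry replaces an earlier one."""
    seen = {}
    for entry in ips:
        seen[(entry["ip"], entry["port"])] = entry
    return list(seen.values())

def log_step(ips, label):
    """Print the number of remaining nodes per network after a filter step."""
    counts = {net: 0 for net in MIN_UPTIME}
    for entry in ips:
        counts[entry["net"]] += 1
    print(f"{counts['ipv4']:6d} {counts['ipv6']:6d} {counts['onion']:6d} {label}")

ips = [
    {"net": "ipv4", "ip": "192.0.2.1", "port": 8333, "uptime": 90.0},
    {"net": "ipv4", "ip": "192.0.2.1", "port": 8333, "uptime": 90.0},  # duplicate
    {"net": "ipv4", "ip": "192.0.2.2", "port": 8333, "uptime": 20.0},
    {"net": "onion", "ip": "a" * 56 + ".onion", "port": 8333, "uptime": 20.0},
]
log_step(ips, "Initial")
ips = dedup(ips)
log_step(ips, "After removing duplicates")
ips = [e for e in ips if e["uptime"] >= MIN_UPTIME[e["net"]]]
log_step(ips, "Require minimum uptime")
```

In this example the onion entry with 20% uptime survives while the IPv4 entry with the same uptime is dropped, which is the effect of varying the requirement per network.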
Allow for non-8333 nodes to appear in the internal seeds. This will
allow bitcoind to bypass a filter on 8333. This also makes it possible to
use the same tool for e.g. testnet.
As hosts with multiple nodes per IP are likely abusive, add a filter to
remove these (the ASN check will take care of them for IPv4, but not
IPv6 or onion).
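A minimal sketch of such a filter, assuming each entry carries `ip` and `port` keys. The `filter_multiport` name and the sample data are illustrative assumptions:

```python
from collections import defaultdict

def filter_multiport(ips):
    """Drop every host that appears with more than one port."""
    by_host = defaultdict(list)
    for entry in ips:
        by_host[entry["ip"]].append(entry)
    # Keep a host only if exactly one entry was seen for it.
    return [entries[0] for entries in by_host.values() if len(entries) == 1]

ips = [
    {"ip": "192.0.2.1", "port": 8333},
    {"ip": "192.0.2.2", "port": 8333},
    {"ip": "192.0.2.2", "port": 8334},    # same host, second port
    {"ip": "2001:db8::1", "port": 18333}, # non-8333 port is fine on its own
]
kept = filter_multiport(ips)
```

Note that the sketch removes all entries for an offending host rather than keeping one of them. It also assumes exact duplicates were removed beforehand, since a repeated (ip, port) pair would otherwise count as a second entry.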
- Moved all seed related scripts to contrib/seeds for consistency
- Updated `makeseeds.py` to handle IPv6 and onions, fix regular
expression for recent Bitcoin Core versions
- Fixed a bug in `generate-seeds.py` with regard to IPv6 parsing