Rust crates: asn-db and asn-tools

Category: IT Technology · Published: 4 years ago

Part of a webmaster's job is to protect the website from malicious traffic. One practice that serves this purpose is collecting and analysing the website's HTTP access logs.

I have developed a Rust library and command-line tools to help tell malicious traffic apart from actual visitors.

Malicious traffic

Not all website visitors are people interested in its content. Websites receive a stream of content requests coming from web crawler programs called "bots".

Not all bots are welcome to crawl the website. Some are poorly implemented and do not respect the rules defined in robots.txt or the maximum request rates signalled by 429 response codes. Bad actors can also program bots for malicious purposes. These can include cloning (where someone hosts a copy of the website for phishing or ad revenue) or blocking access to the website with a denial-of-service attack.

This bot traffic can lead to high server resource utilization and, in the end, to slower performance for the visitors.

Autonomous System Number database

One way I keep an eye on bot traffic is by using information from the Autonomous System Number (ASN) database. This database contains IP network ranges (prefixes) and their assigned ASNs, along with the name of the operator and the country. This data reflects ownership of different blocks of IP address space.

The IPtoASN website, maintained by Frank Denis, publishes recent ASN database files in tab-separated values (TSV) format.
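
To make the data format concrete, here is a minimal sketch of parsing one row of such a TSV file using only the standard library. It assumes the IPv4 file's columns are range start, range end, AS number, country code and AS description, in that order. The `AsnRow` struct and `parse_row` function are hypothetical names for illustration and are not part of the asn-db API.

```rust
use std::net::Ipv4Addr;

/// One row of the IPv4 TSV file (struct and field names are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct AsnRow {
    pub range_start: Ipv4Addr,
    pub range_end: Ipv4Addr,
    pub as_number: u32,
    pub country: String,
    pub owner: String,
}

/// Parses one tab-separated line.
/// Returns None if a field is missing or cannot be parsed.
/// Assumed column order: start, end, AS number, country, description.
pub fn parse_row(line: &str) -> Option<AsnRow> {
    // Drop a trailing line ending, then split on tabs.
    let line = line.trim_end_matches(|c| c == '\r' || c == '\n');
    let mut fields = line.split('\t');

    let range_start: Ipv4Addr = fields.next()?.parse().ok()?;
    let range_end: Ipv4Addr = fields.next()?.parse().ok()?;
    let as_number: u32 = fields.next()?.parse().ok()?;
    let country = fields.next()?.to_string();
    let owner = fields.next()?.to_string();

    Some(AsnRow {
        range_start,
        range_end,
        as_number,
        country,
        owner,
    })
}
```

Calling `parse_row` on each line of the file and skipping `None` results would yield the rows to index.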

Rust crates

I have written a library and command-line tools to perform IP lookups in the ASN database file obtained from IPtoASN.

The asn-db crate is a library that can read and index the TSV ASN database file for fast lookups of networks containing a given IP address. The asn-tools crate is a set of command-line tools to update and query a local copy of the index.
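
One common way to get fast lookups of this kind is to keep the ranges sorted by their first address and binary-search them. The sketch below illustrates that general idea with the standard library only. It is not taken from asn-db, whose internals and API may differ, and the `Network` and `Index` types are hypothetical. It assumes the ranges do not overlap.

```rust
use std::net::Ipv4Addr;

/// An address range with its AS number (illustrative type).
#[derive(Debug, Clone, PartialEq)]
pub struct Network {
    pub first: u32, // first address of the range, as an integer
    pub last: u32,  // last address of the range, inclusive
    pub as_number: u32,
}

/// Ranges sorted by `first`; assumed not to overlap.
pub struct Index {
    networks: Vec<Network>,
}

impl Index {
    pub fn new(mut networks: Vec<Network>) -> Self {
        networks.sort_by_key(|n| n.first);
        Index { networks }
    }

    /// Finds the range containing `ip`, if any, in O(log n).
    pub fn lookup(&self, ip: Ipv4Addr) -> Option<&Network> {
        let ip = u32::from(ip);
        // Count the ranges that start at or before `ip`.
        let count = self.networks.partition_point(|n| n.first <= ip);
        // The last of those is the only candidate; none means no match.
        let candidate = self.networks.get(count.checked_sub(1)?)?;
        if ip <= candidate.last {
            Some(candidate)
        } else {
            None
        }
    }
}
```

The final bounds check matters because the database has gaps: an address can fall after the start of one range and before the start of the next without belonging to either.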

Crate: asn-db

In my work, I use the asn-db library as part of an HTTP access log processing flow. With every request sent to the website, a query with the client's IP address runs against the ASN database to obtain a matching network prefix record. Network prefix and ownership data are then stored with the IP address and other request and response data in the access log database.

Later I can aggregate and visualize access information for each source subnet and network operator. With this information, I can often tell whether a particular set of requests comes from a user network (such as a mobile or broadband operator) or from a virtual server hosting company, in which case it is likely to originate from bots. Having ASN data stored in the HTTP access log database lets me drill down further into the sources and character of the traffic, so that any blocking based on client IP addresses targets bot traffic only and does not affect real visitors.
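
As an illustration of the aggregation step, the following sketch counts requests per network operator once each log entry carries its AS number and owner. The `LoggedRequest` type and `requests_per_operator` function are hypothetical. A real access log database would hold many more fields and would likely do this grouping in a query.

```rust
use std::collections::HashMap;

/// The ASN-related part of a stored access log entry (illustrative).
pub struct LoggedRequest {
    pub as_number: u32,
    pub owner: String,
}

/// Returns (AS number, owner, request count) rows, busiest operator first.
pub fn requests_per_operator(log: &[LoggedRequest]) -> Vec<(u32, String, usize)> {
    let mut counts: HashMap<u32, (String, usize)> = HashMap::new();
    for req in log {
        let entry = counts
            .entry(req.as_number)
            .or_insert_with(|| (req.owner.clone(), 0));
        entry.1 += 1;
    }

    let mut rows: Vec<(u32, String, usize)> = counts
        .into_iter()
        .map(|(as_number, (owner, count))| (as_number, owner, count))
        .collect();
    // Sort by count, descending; break ties by AS number for a stable order.
    rows.sort_by(|a, b| b.2.cmp(&a.2).then(a.0.cmp(&b.0)));
    rows
}
```

A hosting provider near the top of such a list is a hint worth investigating, not proof of bot traffic on its own.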

Crate: asn-tools

The asn-tools crate contains two command-line programs that use the asn-db library. Run the asn-update program to download the TSV file from the IPtoASN website and index it to a locally stored file. Then use asn-lookup to load and query the local index for information.

The asn-lookup tool takes a list of IP addresses, either as tabular data (CSV) on standard input or as command-line arguments, and prints the ASN database records of the matching networks in one of several output formats.

$ asn-lookup 1.1.1.1 8.8.8.8 9.9.9.9
Network    Country AS Number Owner                            Matched IPs
1.1.1.0/24 US      13335     CLOUDFLARENET - Cloudflare, Inc. 1.1.1.1
8.8.8.0/24 US      15169     GOOGLE - Google LLC              8.8.8.8
9.9.9.0/24 US      19281     QUAD9-AS-1 - Quad9               9.9.9.9

Managing bot traffic

After I have identified the IP addresses of badly behaving bots, I use the asn-lookup tool to get the list of networks these bots are operating from. I then use that list as the basis for an access control list (ACL) that blocks access to the website. This way, I can block the whole IP address ranges that malicious traffic is coming from, which keeps my ACLs short and blocks bots even if they keep changing IPs within their networks. I use this method as the last layer of defence, when bot traffic is nuanced and automatic per-IP throttling is not sufficient.

I use the Varnish HTTP reverse proxy and cache in front of the website's back-end servers. I manage its configuration with the Puppet configuration management system. To make my life easier, I have added a puppet output format to the asn-lookup command-line tool, so I can use its output as-is in my configuration manifests.
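
To show why a single network entry keeps matching a bot that hops between addresses, here is a minimal sketch of the prefix test an ACL entry implies. The `network_contains` function is a hypothetical helper written for this example and is not part of asn-db or Varnish.

```rust
use std::net::Ipv4Addr;

/// Returns true if `ip` falls inside `network`/`prefix_len`.
pub fn network_contains(network: Ipv4Addr, prefix_len: u8, ip: Ipv4Addr) -> bool {
    if prefix_len > 32 {
        return false; // not a valid IPv4 prefix length
    }
    // A /0 prefix matches everything. It is handled separately because
    // shifting a u32 by 32 bits would overflow.
    let mask = if prefix_len == 0 {
        0
    } else {
        u32::MAX << (32 - u32::from(prefix_len))
    };
    // Two addresses share a network when their masked bits are equal.
    (u32::from(network) & mask) == (u32::from(ip) & mask)
}
```

For example, an entry for 54.248.0.0/13 matches both 54.248.1.1 and 54.255.9.9, so a bot that moves between those addresses stays blocked.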

$ asn-lookup -o puppet 105.0.0.123 54.248.1.1 66.249.64.1
'54.248.0.0/13', # US 16509 AMAZON-02 - Amazon.com, Inc. (54.248.1.1)
'66.249.64.0/20', # US 15169 GOOGLE - Google LLC (66.249.64.1)
'105.0.0.0/15', # ZA 37168 CELL-C (105.0.0.123)
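
The shape of this output can be approximated with a short sketch that groups matched IPs by network and prints one quoted prefix per line, with the record details in a trailing comment. This mimics the sample above and is not the actual asn-lookup implementation. The `Match` type and `acl_lines` function are hypothetical, and joining several matched IPs with a comma is an assumption.

```rust
use std::collections::BTreeMap;
use std::net::Ipv4Addr;

/// A looked-up IP together with its matching network record (illustrative).
pub struct Match {
    pub network: Ipv4Addr,
    pub prefix_len: u8,
    pub country: String,
    pub as_number: u32,
    pub owner: String,
    pub ip: Ipv4Addr,
}

/// Produces one ACL line per distinct network, in numeric network order.
pub fn acl_lines(matches: &[Match]) -> Vec<String> {
    // Group matched IPs by network. A BTreeMap keyed on the numeric
    // network address keeps the output sorted.
    let mut grouped: BTreeMap<(u32, u8), (&Match, Vec<String>)> = BTreeMap::new();
    for m in matches {
        grouped
            .entry((u32::from(m.network), m.prefix_len))
            .or_insert_with(|| (m, Vec::new()))
            .1
            .push(m.ip.to_string());
    }

    grouped
        .values()
        .map(|(m, ips)| {
            format!(
                "'{}/{}', # {} {} {} ({})",
                m.network,
                m.prefix_len,
                m.country,
                m.as_number,
                m.owner,
                ips.join(", ") // assumption: multiple IPs are comma-separated
            )
        })
        .collect()
}
```

Grouping by network is what keeps the resulting list short: many offending IPs inside one prefix collapse into a single entry.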

Code and installation

These tools and the library are released under the MIT licence and are free to use. The ASN database is released under the Public Domain Dedication on the IPtoASN website maintained by Frank Denis.

For information on installation and detailed usage, please go to the asn-tools source repository and the asn-db source repository (also available on my GitHub profile).

