Downloading all the crates on crates.io

Category: IT · Published: 4 years ago

There are a lot of reasons you might want to download all the crates ever uploaded to crates.io, Rust’s package registry: code analysis across the whole public ecosystem, hosting a mirror for your company, or countless other ideas and projects.

The team behind crates.io receives a lot of support requests asking what’s the best and least impactful way to do this, so here is a little guide on how to do that!

Getting a list of all the crates

crates.io offers multiple ways to interact with its data: the crates.io-index GitHub repository, experimental daily database dumps, and the crates.io API.

The way I recommend getting the list of all the crates is to rely on the index: the experimental database dumps are more heavyweight and are only updated daily, while usage of the API is governed by the crawlers policy (limiting you to one API call per second). If you absolutely need to use the API, please talk with us by emailing help@crates.io, and we’ll figure out a solution.

The index is a git repository, and the format of its content is defined by RFC 2141. There are crates such as crates-index that allow you to easily query its contents, and I recommend using them whenever possible.
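
To give a feel for what those crates do under the hood, here is a minimal Python sketch that reads the index by hand. It assumes the layout described in Cargo's registry index documentation (one file per crate, one JSON object per line, with name, vers and cksum fields); the function names index_path and parse_index_file are made up for this illustration and are not part of any library.

```python
import json


def index_path(name):
    """Relative path of a crate's file inside the index repository.

    Assumes the documented layout: 1/ and 2/ for one- and two-letter
    names, 3/{first letter}/ for three-letter names, and
    {first two}/{next two}/ for everything else, all lowercased.
    """
    lower = name.lower()
    if len(lower) == 1:
        return f"1/{lower}"
    if len(lower) == 2:
        return f"2/{lower}"
    if len(lower) == 3:
        return f"3/{lower[0]}/{lower}"
    return f"{lower[:2]}/{lower[2:4]}/{lower}"


def parse_index_file(text):
    """Yield (name, version, checksum) for each release in one index file."""
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        entry = json.loads(line)
        yield entry["name"], entry["vers"], entry["cksum"]
```

Walking a local clone of the index and feeding every file to parse_index_file would produce the full list of releases. For anything beyond a quick experiment, a maintained crate such as crates-index is likely the safer choice.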

Downloading the packages

The best way to download the packages is to fetch them directly from our CDN. Compared to calling the crates.io API, the CDN does not have rate limits and is faster (as the API redirects you to the CDN after updating the download count). The CDN URLs follow this pattern:

https://static.crates.io/crates/{name}/{name}-{version}.crate

For example, https://static.crates.io/crates/serde/serde-1.0.0.crate is the link to download Serde 1.0.0. Packages are tar.gz files.
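
As a rough illustration, the URL pattern and the tar.gz format could be handled in Python like this. The helper names crate_url and list_crate_files are hypothetical; only the URL pattern itself comes from the text above.

```python
import io
import tarfile

# URL pattern quoted in the article.
CDN_PATTERN = "https://static.crates.io/crates/{name}/{name}-{version}.crate"


def crate_url(name, version):
    """Build the CDN download URL for one release of a crate."""
    return CDN_PATTERN.format(name=name, version=version)


def list_crate_files(crate_bytes):
    """Return the file names stored in a downloaded .crate file.

    A .crate file is a gzipped tarball, so the standard tarfile
    module can read it directly from memory.
    """
    with tarfile.open(fileobj=io.BytesIO(crate_bytes), mode="r:gz") as archive:
        return archive.getnames()
```

For instance, crate_url("serde", "1.0.0") yields the Serde link mentioned above.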

If you want to ensure the contents of the CDN were not tampered with, you can verify the SHA256 checksum of the file you downloaded by comparing it with the cksum field in the index.
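
A minimal sketch of that check in Python, assuming the downloaded file is already in memory and that cksum is the hex-encoded SHA256 digest taken from the crate's index entry (matches_checksum is a name invented for this example):

```python
import hashlib


def matches_checksum(crate_bytes, cksum):
    """Return True if the SHA256 of the downloaded bytes equals cksum.

    cksum is expected to be the hex string from the index entry.
    """
    actual = hashlib.sha256(crate_bytes).hexdigest()
    return actual == cksum.lower()
```

A file that fails the check should probably be discarded and downloaded again rather than stored.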

Keeping your local copy up to date

The best way to keep your local copy up to date is to fetch a fresh list of crates available on crates.io and check if all of them are present in the local system, downloading the ones you’re missing. I personally recommend this approach as it’s less error-prone, and it heals your copy automatically if for whatever reason some of the changes were lost during a previous update.
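
The "check what is missing" step could look roughly like the Python sketch below. It assumes the local copy mirrors the CDN layout ({name}/{name}-{version}.crate under a root directory); that layout, and the names local_path and missing_releases, are choices made for this example rather than anything crates.io prescribes.

```python
from pathlib import Path


def local_path(root, name, version):
    """Where a release would live in the local copy (assumed layout)."""
    return Path(root) / name / f"{name}-{version}.crate"


def missing_releases(releases, root):
    """Return the (name, version) pairs that are not on disk yet.

    releases is any iterable of (name, version) pairs, for example
    the full list read from a fresh checkout of the index.
    """
    return [
        (name, version)
        for name, version in releases
        if not local_path(root, name, version).is_file()
    ]
```

Because the result depends only on the current index and the current disk contents, running it again after an interrupted update picks up whatever was left behind.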

Another interesting approach you could implement is to get the difference since the last update of the index with git diff, parsing its output to get the list of crates that were added. There are also third-party crates such as crates-index-diff that automate this process for you. This approach is more fragile and error-prone, but it might be the only sensible solution if checking whether you downloaded a crate or not is slow or expensive.
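
To make the git diff idea concrete, here is a small Python sketch that extracts releases from the added lines of a unified diff. It assumes each index line is a JSON object with name and vers fields, and added_releases is a hypothetical helper, not the API of crates-index-diff.

```python
import json


def added_releases(diff_text):
    """Return (name, version) pairs for index lines added in a unified diff."""
    added = []
    for line in diff_text.splitlines():
        # Added index entries show up as "+" followed by a JSON object.
        # This also skips "+++ b/path" file headers and context lines.
        if not line.startswith("+{"):
            continue
        entry = json.loads(line[1:])
        added.append((entry["name"], entry["vers"]))
    return added
```

Note that a changed line also appears as an addition in a diff (for example when a release is yanked, which rewrites its existing entry), so the result can include releases you already have. This is one of the reasons the approach is more fragile.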

Common issues to be aware of

While the basics of downloading the contents of crates.io are simple, there are a couple of issues to be aware of when implementing such tooling:

  • The crates.io team strives to keep the registry as immutable as possible, but we can’t always keep that promise. The technology world doesn’t exist in a bubble, and there are laws everyone needs to abide by. Occasionally we receive takedown requests due to trademark or copyright issues, and we have to remove the crates both from the registry and the CDN: your tooling should handle existing crates disappearing.

  • To reduce the download size for cargo users we regularly squash the index repository into a single commit, and start the git history from scratch. The previous history is kept in a separate branch. To account for this we recommend running these commands to update the index:

    git fetch
    git reset --hard origin/master
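
For the first issue, crates disappearing, a downloader can treat a missing file as an expected outcome instead of a fatal error. The Python sketch below is one way to do that. It assumes the CDN answers with HTTP 403 or 404 for a removed file, which is a guess on my part and worth confirming against real responses; fetch_crate and its opener parameter are names invented for this example.

```python
import urllib.error
import urllib.request


def fetch_crate(url, opener=urllib.request.urlopen):
    """Download one .crate file, returning None if it no longer exists.

    Assumption: a removed crate results in HTTP 403 or 404. Any other
    HTTP error is re-raised so real outages are not silently ignored.
    The opener parameter exists only to make the function easy to test.
    """
    try:
        with opener(url) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code in (403, 404):
            return None
        raise
```

A caller can then log the None result and move on to the next release, and optionally delete its own copy of a crate that has vanished from the index.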
