Practical use of Ruby PStore
The Arkency blog has undergone several improvements over recent weeks. One such change was opening the source of the blog articles. We’ve concluded that having posts in the open would shorten the feedback loop and allow our readers to collaborate and make the articles better for all.
Nanoc + Github
For years the blog has been driven by nanoc, which is a static-site generator. You put a bunch of markdown files in, drop in a layout, and HTML comes out the other side. Let’s call this magic “compilation”. One of nanoc’s prominent features is data sources. With them one can render content from more than just the local filesystem. Given an appropriate adapter, posts, pages or other data items can be fetched from a 3rd party API. Like an SQL database. Or Github!
Choosing Github as a backend for our posts was a no-brainer. Developers are familiar with it. It has quite a nice integrated web editor with Markdown preview, which gives in-place editing. Pull requests create the space for discussion. Last but not least, there is the octokit gem for API interaction, taking much of the implementation burden off our shoulders.
An initial data adapter to fetch articles looked like this:
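    require "base64"
    require "octokit"

    class Source < Nanoc::DataSource
      identifier :github

      def items
        client = Octokit::Client.new(access_token: ENV['GITHUB_TOKEN'])
        client
          .contents(ENV['GITHUB_REPO'])                    # one request: list the files
          .select { |item| item[:name].end_with?(".md") }  # keep Markdown files only
          .map    { |item| client.contents(ENV['GITHUB_REPO'], path: item[:path]) } # one request per file
          .map do |item|
            # The contents API returns the file body Base64-encoded.
            content = Base64.decode64(item[:content]).force_encoding("UTF-8")
            # nanoc identifiers must start with a slash.
            new_item(content, item.to_h, Nanoc::Identifier.new("/#{item[:path]}"))
          end
      end
    end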
This code:
- gets a list of files in the repository
- filters it by extension so that only Markdown files stay
- gets content of each markdown file
- transforms it into a nanoc item object
Good enough for a quick spike and exploration of the problem. It becomes problematic as soon as you start using it for real. Can you spot the problems?
Source data improved
For a repository with 100 markdown files we have to make 100 + 1 HTTP requests to retrieve the content:
- it takes time and becomes annoying when you’re in the change-layout-recompile-content cycle of work on the site
- there is an API request limit per hour (higher when using a token, but still present)
Making those requests in parallel will only make us hit the request quota faster. Something has to be done to limit the number of requests needed.
Luckily enough, the octokit gem uses the faraday library for HTTP interaction, and some kind souls documented how one could leverage the faraday-http-cache middleware.
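To put rough numbers on the quota problem, here is a small back-of-the-envelope sketch. The limits used (60 requests per hour without a token, 5,000 with one) are my assumption about GitHub's primary REST API rate limits and may differ for your account; the helper names are made up for illustration.

```ruby
# Assumed hourly limits; check GitHub's documentation for current values.
UNAUTHENTICATED_LIMIT = 60
AUTHENTICATED_LIMIT   = 5_000

# One request to list the directory, plus one request per file.
def requests_per_compile(file_count)
  file_count + 1
end

# How many complete compilations fit into one hour of quota.
def full_compiles_per_hour(hourly_limit, file_count)
  hourly_limit / requests_per_compile(file_count)
end

puts requests_per_compile(100)
puts full_compiles_per_hour(UNAUTHENTICATED_LIMIT, 100)
puts full_compiles_per_hour(AUTHENTICATED_LIMIT, 100)
```

Under these assumptions a 100-file repository cannot finish even one uncached compile without a token, and a token buys a few dozen compiles per hour, which is not much in a tight edit-recompile loop.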
    class Source < Nanoc::DataSource
      identifier :github

      def up
        stack = Faraday::RackBuilder.new do |builder|
          builder.use Faraday::HttpCache, serializer: Marshal, shared_cache: false
          builder.use Faraday::Request::Retry, exceptions: [Octokit::ServerError]
          builder.use Octokit::Middleware::FollowRedirects
          builder.use Octokit::Response::RaiseError
          builder.use Octokit::Response::FeedParser
          builder.adapter Faraday.default_adapter
        end
        Octokit.middleware = stack
      end

      def items
        repository_items.map do |item|
          identifier     = Nanoc::Identifier.new("/#{item[:name]}")
          metadata, data = decode(item[:content])

          new_item(data, metadata, identifier, checksum_data: item[:sha])
        end
      end

      private

      def repository_items
        pool  = Concurrent::FixedThreadPool.new(10)
        items = Concurrent::Array.new
        client
          .contents(repository, path: path)
          .select { |item| item[:type] == "file" }
          .each   { |item| pool.post { items << client.contents(repository, path: item[:path]) } }
        pool.shutdown
        pool.wait_for_termination
        items
      rescue Octokit::NotFound => exc
        []
      end

      def client
        Octokit::Client.new(access_token: access_token)
      end

      def repository
        # ...
      end

      def path
        # ...
      end

      def access_token
        # ...
      end

      def decode(content)
        # ...
      end
    end
Notice two main additions here:
- the up method, used by nanoc when spinning up the data source, which introduces the cache middleware
- Concurrent::FixedThreadPool from the concurrent-ruby gem, for concurrent requests in multiple threads
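The body of decode is elided in the listing. Purely as an illustration, and not the author's implementation, here is a hypothetical decode_content that assumes posts start with YAML front matter between --- lines. The one thing that is certain is that the Github contents API returns file bodies Base64-encoded, so some decoding step is needed.

```ruby
require "yaml"

# Hypothetical stand-in for the elided decode method.
# Assumes optional YAML front matter delimited by "---" lines.
FRONT_MATTER = /\A---[ \t]*\r?\n(.*?)\r?\n---[ \t]*\r?\n(.*)\z/m

def decode_content(encoded)
  # unpack1("m") decodes Base64, including the line-wrapped form.
  text = encoded.unpack1("m").force_encoding(Encoding::UTF_8)

  if (match = text.match(FRONT_MATTER))
    # safe_load only permits basic types; an empty header yields nil.
    [YAML.safe_load(match[1]) || {}, match[2]]
  else
    [{}, text] # no front matter: everything is body
  end
end

encoded = ["---\ntitle: Hello\n---\nBody text\n"].pack("m")
metadata, body = decode_content(encoded)
p metadata
p body
```

The return order matches how the listing destructures the result (metadata first, then data).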
If only that cache worked… Out of the box the cache is kept in memory, which is useless for the workflow one has with nanoc. We’d very much like to persist the cache across runs of the compile process. The documentation does show how to switch the cache backend to one from Rails, but that is not helpful advice in the nanoc context either. You probably wouldn’t want to start a Redis or Memcached instance just to compile a bunch of HTML!
Time to roll up our sleeves again. Knowing what API is expected, we can build a file-based cache backend. And there is a little-known standard library gem we can use to avoid reimplementing the basics. Standing on the shoulders of giants, once again.
Enter PStore
PStore is a file-based persistence mechanism built around a Hash. We can store Ruby objects; they’re serialized with Marshal before being dumped to disk. It supports transactional behaviour and can be made thread safe. Sounds perfect for the job!
    require "pstore"

    class Cache
      def initialize(cache_dir)
        # Second argument enables thread safety.
        @store = PStore.new(File.join(cache_dir, "nanoc-github.store"), true)
      end

      def write(name, value, options = nil)
        store.transaction { store[name] = value }
      end

      def read(name, options = nil)
        # true opens a read-only transaction
        store.transaction(true) { store[name] }
      end

      def delete(name, options = nil)
        store.transaction { store.delete(name) }
      end

      private
      attr_reader :store
    end
In the end that cache store turned out to be merely a wrapper around PStore. How convenient! Thread safety is achieved here by PStore using a Mutex internally around the transaction block.
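To see why PStore fits the "persist across runs" requirement, here is a minimal standalone sketch using only the standard library. The helper names and the temporary file are invented for the demo; two separate PStore instances stand in for two separate compile runs.

```ruby
require "pstore"
require "tmpdir"

# Write with one PStore instance, read back with a fresh one,
# mimicking two separate runs of the compile process.
def write_then_reload(path, key, value)
  writer = PStore.new(path, true)           # true = thread safe
  writer.transaction { writer[key] = value }

  reader = PStore.new(path, true)
  reader.transaction(true) { reader[key] }  # true = read-only transaction
end

# Returns true when a read-only transaction refuses a write.
def read_only_rejects_write?(path)
  store = PStore.new(path, true)
  store.transaction(true) { store[:forbidden] = 1 }
  false
rescue PStore::Error
  true
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.store")
  p write_then_reload(path, "etag", "abc123")
  p read_only_rejects_write?(path)
end
```

Using a read-only transaction for reads, as the Cache class does, guards against accidental writes on the read path.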
    class Source < Nanoc::DataSource
      identifier :github

      def up
        stack = Faraday::RackBuilder.new do |builder|
          builder.use Faraday::HttpCache, serializer: Marshal, shared_cache: false, store: Cache.new(tmp_dir)
          # ...
        end
        Octokit.middleware = stack
      end

      # ...
    end
With a persistent cache store plugged into Faraday we can now reap the benefits of cached responses. Subsequent requests to the Github API are skipped. Responses are served directly from local files. That is, as long as the cache stays fresh.
Cache validity can be controlled by several HTTP headers. In the case of the Github API it is Cache-Control: private, max-age=60, s-maxage=60 that matters. Together with the Date header this roughly means that the content stays valid for 60 seconds from the moment the response was received. Is that a lot? For frequently changing content, probably. For blog articles I’d prefer something more long-lasting…
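As a rough sketch of that freshness rule, the check boils down to comparing the response's age with max-age. This is a deliberate simplification of what a real HTTP cache does (it ignores the Age and Expires headers, for example), and the method names are mine rather than anything from faraday-http-cache.

```ruby
require "time"

# Extract max-age from a Cache-Control value; 0 when absent.
# The lookbehind-style prefix avoids matching inside other directives.
def max_age_seconds(cache_control)
  cache_control[/(?:\A|[\s,])max-age=(\d+)/, 1].to_i
end

# Simplified freshness: response age (now - Date) must be below max-age.
def fresh?(headers, now: Time.now)
  received_at = Time.httpdate(headers.fetch("Date"))
  now - received_at < max_age_seconds(headers.fetch("Cache-Control"))
end

headers = {
  "Cache-Control" => "private, max-age=60, s-maxage=60",
  "Date"          => "Mon, 01 Jan 2024 00:00:00 GMT"
}
received = Time.httpdate(headers["Date"])

puts fresh?(headers, now: received + 30) # half a minute later
puts fresh?(headers, now: received + 90) # a minute and a half later
```

With a 60 second window, nearly every recompile after a layout tweak lands outside it, which is what motivates stretching max-age.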
And that is how we arrive at the last piece of nanoc-github: a faraday middleware that extends the cache time. It is quite a primitive piece of code that replaces the max-age value with the desired one. For my particular needs I set this value to 3600 seconds. The general idea is that we modify HTTP responses from the API before they hit the cache. The cache middleware then judges cache validity based on the modified max-age rather than the original one. Simple and good enough. Just be careful to add this to the middleware stack in the correct order 😅
    class ModifyMaxAge < Faraday::Middleware
      def initialize(app, time:)
        @app  = app
        @time = Integer(time)
      end

      def call(request_env)
        @app.call(request_env).on_complete do |response_env|
          response_env[:response_headers][:cache_control] = "public, max-age=#{@time}, s-maxage=#{@time}"
        end
      end
    end
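Why the order matters can be shown with a tiny pure-Ruby simulation. This is not Faraday itself: the lambdas below are stand-ins I made up, and the only assumption carried over is that a Rack-style builder makes the first listed middleware the outermost one, so responses travel back through the list in reverse.

```ruby
# Compose middlewares Rack-style: first listed is outermost,
# last listed sits right next to the adapter.
def build_stack(middlewares, adapter)
  middlewares.reverse.reduce(adapter) { |app, middleware| middleware.call(app) }
end

# Stand-in for the HTTP adapter: always answers with the short max-age.
ADAPTER = ->(_request) { { "Cache-Control" => "private, max-age=60, s-maxage=60" } }

# Stand-in for ModifyMaxAge: rewrites the header on the way back.
MODIFY_MAX_AGE = lambda do |app|
  ->(request) { app.call(request).merge("Cache-Control" => "public, max-age=3600, s-maxage=3600") }
end

# Stand-in for the cache: records the header it would judge freshness by.
def recording_cache(log)
  lambda do |app|
    lambda do |request|
      response = app.call(request)
      log << response["Cache-Control"]
      response
    end
  end
end

# Build a stack in the given order and report what the cache saw.
def header_seen_by_cache(order)
  log = []
  middlewares = order.map { |name| name == :cache ? recording_cache(log) : MODIFY_MAX_AGE }
  build_stack(middlewares, ADAPTER).call(:get_article)
  log.first
end

puts header_seen_by_cache(%i[cache modify_max_age])
puts header_seen_by_cache(%i[modify_max_age cache])
```

In this model the cache only sees the extended max-age when the rewriting middleware is listed after it, that is, closer to the adapter. If Faraday's builder behaves the same way, ModifyMaxAge belongs below Faraday::HttpCache in the stack.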
And that’s it! I hope you found this article useful and learned a thing or two. Drop me a line on my twitter or leave a star on this project:
Happy hacking!