What’s New in Solr 8


A roll-up of the latest features in Apache Solr 8

by Cassandra Targett on April 6, 2020

The Lucene PMC recently announced the release of version 8.5. We’re several releases into the 8.x release line now, and with a cadence of a new release every 8-10 weeks, you’d be forgiven if you hadn’t kept up with the latest features.

As there have been a number of exciting changes in Solr in the past few releases, I thought it would be helpful to review a few that have been released in 8.3-8.5.

Package Manager

Solr users have long wished for a proper plugin ecosystem. For those who build their own query parsers and other types of customizations, deploying their code has been cumbersome at best. It’s also never been possible to easily share your code with others who may benefit from similar changes.

Starting in 8.4, we have a package management system in Solr that allows for hot deployment of plugins across all nodes of a cluster, secure signing of plugins from trusted remote repositories, and consistent packaging guidance for plugins. One key feature of hot deployment is the ability to add or upgrade plugins without having to manually move .jars and then restart every node of the cluster.
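As a rough sketch of what that workflow looks like from the command line, the Python snippet below only assembles the bin/solr package commands (add-repo, list-available, install, deploy) in the order you would typically run them. The repository name and URL, package name, version, and collection are all hypothetical, and it assumes the nodes were started with package support enabled (-Denable.packages=true).

```python
# Sketch only: builds the bin/solr command lines for a typical package workflow.
# The repository name/URL, package name, version, and collection are hypothetical.

def package_workflow(repo_name, repo_url, package, version, collections):
    """Return argv lists for adding a repo, then installing and deploying a package."""
    spec = f"{package}:{version}"
    return [
        # Register a trusted remote repository of signed packages.
        ["bin/solr", "package", "add-repo", repo_name, repo_url],
        # List what the registered repositories offer.
        ["bin/solr", "package", "list-available"],
        # Install the package into the cluster (no node restarts needed).
        ["bin/solr", "package", "install", spec],
        # Deploy the installed package to one or more collections.
        ["bin/solr", "package", "deploy", spec, "-collections", ",".join(collections)],
    ]


if __name__ == "__main__":
    for argv in package_workflow("example-repo", "https://repo.example.com/solr",
                                 "my-query-parser", "1.0.0", ["films"]):
        print(" ".join(argv))
```

Each list can be passed to subprocess.run() from the Solr installation directory; printing them first is a cheap way to review what would be executed.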

The Lucene/Solr community hopes to remove from Solr some features that have long lived as “contribs” (like Solr Cell, DataImportHandler, Carrot, etc.) to allow them to become plugins maintained by people passionate about them who can give them the attention they deserve. If you would be interested in taking one of these on, please send a mail to the development mailing list dev@lucene.apache.org (be sure to subscribe to the list to see replies!).

Caches

Solr 8.3 added a brand new cache implementation, CaffeineCache, which we expect to provide most users with a lower memory footprint, a higher hit ratio, and better multi-threaded performance.

CaffeineCache is based on the Caffeine caching library, which by default uses a “Window Tiny Least Frequently Used” (W-TinyLFU) eviction policy. This allows eviction based on both frequency and recency of use.
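To make "frequency and recency" a little more concrete, here is a deliberately simplified toy in Python. It is not Caffeine's W-TinyLFU (which uses an admission window, a segmented main space, and a compact frequency sketch), and it is not Solr code. It only illustrates the core idea under my own assumptions: recency picks the eviction candidate, and frequency decides whether a newcomer is worth admitting in its place.

```python
from collections import Counter, OrderedDict


class ToyFrequencyAwareCache:
    """Toy cache: LRU order picks the victim, access frequency gates admission."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> value, least recently used first
        self.frequency = Counter()    # access counts for every key ever seen

    def get(self, key):
        self.frequency[key] += 1
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        """Insert a value; return False if the newcomer was refused admission."""
        self.frequency[key] += 1
        if key in self.entries:
            self.entries[key] = value
            self.entries.move_to_end(key)
            return True
        if len(self.entries) < self.capacity:
            self.entries[key] = value
            return True
        victim = next(iter(self.entries))  # least recently used entry
        if self.frequency[key] <= self.frequency[victim]:
            return False  # newcomer is no more popular than the victim
        del self.entries[victim]
        self.entries[key] = value
        return True
```

With a capacity of two, a key seen only once cannot push out an existing entry that has been used just as often, which is the kind of protection against one-off scans that a pure LRU policy lacks.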

This implementation will become the default in Solr 9, and all other existing cache implementations will be removed at that time. Caching has a direct impact on the performance of your Solr installation, so it’s best to plan ahead by trying CaffeineCache out in a dev or QA environment now.

Security

Any discussion of recent releases must mention the many changes made in Solr 8.4 and 8.5 to improve Solr’s default security posture. Our goal has been to make Solr more secure out of the box.

My colleague Erik Hatcher covered the changes in 8.4 in his excellent post, Default Security in Solr 8.4.

In 8.5, a few more features have been added, namely the ability to run Solr with a Java Security Manager enabled, and the ability to whitelist or blacklist IP addresses or ranges to control which clients can access any Solr interface (UI or API).
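For readers who haven’t worked with address-based filtering before, the following Python sketch shows the general idea using the standard ipaddress module. It is an illustration of the concept only, not Solr’s implementation: the function name is hypothetical, and the rule that a deny match wins over an allow match is my assumption.

```python
import ipaddress


def is_allowed(client_ip, allow=(), deny=()):
    """Decide whether a client address passes simple allow/deny lists.

    Rules may be single addresses ("127.0.0.1") or CIDR ranges
    ("192.168.0.0/24"). Assumption: a deny match always wins, and an
    empty allow list means "allow everything that is not denied".
    """
    addr = ipaddress.ip_address(client_ip)

    def matches(rules):
        return any(addr in ipaddress.ip_network(rule, strict=False) for rule in rules)

    if matches(deny):
        return False
    return matches(allow) if allow else True


if __name__ == "__main__":
    rules = ["127.0.0.1", "192.168.0.0/24"]
    print(is_allowed("192.168.0.17", allow=rules))  # inside the allowed range
    print(is_allowed("10.0.0.5", allow=rules))      # outside every allowed rule
```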

Indexing Log Files

Solr’s logs contain a wealth of information, but in a production system they can be difficult to read. There’s so much going on that it’s hard to separate the signal from the noise.

Starting with 8.5, Solr has a simple way to index its own log files into a Solr collection with a new wrapper script in the bin/ directory called postlogs. This script parses the log files and indexes them to a collection of your choice.

Once they are indexed, you can query them for errors or patterns. Visualizing the system activity with something like Apache Zeppelin when trying to diagnose a problem can be incredibly powerful – how many commits are you really doing? How slow are the queries users are complaining about? Are those outliers or evidence of a persistent problem?
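Here is a small Python sketch of both halves: the postlogs invocation (a collection URL followed by the root directory of the logs) and a query for the slowest requests once the logs are indexed. The collection name and paths are hypothetical, and the field names type_s and qtime_i are my assumption about what the script produces, so check them against your own indexed documents before relying on them.

```python
from urllib.parse import urlencode


def postlogs_command(collection_url, log_root):
    """Build the argv for bin/postlogs: target collection URL, then log directory."""
    return ["bin/postlogs", collection_url, log_root]


def slow_query_url(collection_url, min_qtime_ms, rows=10):
    """Build a /select URL that finds logged queries slower than min_qtime_ms.

    Assumes the indexed log records carry a record type in `type_s`
    and the query time in milliseconds in `qtime_i`.
    """
    params = {
        "q": f"type_s:query AND qtime_i:[{min_qtime_ms} TO *]",
        "sort": "qtime_i desc",
        "rows": rows,
    }
    return f"{collection_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    logs = "http://localhost:8983/solr/logs"  # hypothetical collection for log data
    print(" ".join(postlogs_command(logs, "/var/solr/logs")))
    print(slow_query_url(logs, 1000))
```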

There’s obviously the potential to create an infinite loop if you index Solr’s logs to itself continually. This tool is intended for troubleshooting and not for monitoring.

New Delete-by-Query Approach

Delete-by-query operations can be very expensive, particularly with distributed collections. Best practice advice is usually to avoid them in a busy production system. The reason for this is that they block all other document updates while the query is executed and the results processed. This is done to ensure out-of-order updates and optimistic concurrency constraints are properly processed. A side effect, however, is that in some cases, other updates can queue up and eventually cause replicas to go into recovery.

There are times when you may not care about preserving document order and version consistency; you just want the documents to be deleted. For example, if you want to purge the index of all documents older than 60 days, you may know that no incoming documents will have a timestamp that old, so there is no need to block the entire indexing stream and potentially cause severe downstream consequences.

To provide an alternative, a new stream decorator delete() has been added which operates similarly to the long-existing update() decorator by wrapping a streaming expression. This decorator allows a faster delete-by-query that is non-blocking – every tuple output by the inner stream includes a document ID which can be quickly deleted from the index. As an added benefit, the full extent of streaming expression syntax is available for identifying the documents to be deleted.
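As a sketch of the 60-day purge described above, the Python below assembles a delete() expression wrapping a search() stream, plus the request you would POST to the collection’s /stream handler. The collection name and the timestamp_dt field are hypothetical. It also assumes the id field can be read through the /export handler (which needs docValues), and it requests only id so that no version constraint comes into play.

```python
from urllib.parse import urlencode


def purge_expression(collection, date_field, max_age_days, batch_size=500):
    """Build a delete() streaming expression for documents older than max_age_days."""
    query = f"{date_field}:[* TO NOW-{max_age_days}DAYS]"
    # The inner stream only needs to emit the id of each document to delete.
    inner = f'search({collection}, q="{query}", qt="/export", fl="id", sort="id asc")'
    return f"delete({collection}, batchSize={batch_size}, {inner})"


def stream_request(base_url, collection, expression):
    """Return the /stream URL and the form-encoded body carrying the expression."""
    return f"{base_url}/{collection}/stream", urlencode({"expr": expression})


if __name__ == "__main__":
    expr = purge_expression("events", "timestamp_dt", 60)
    url, body = stream_request("http://localhost:8983/solr", "events", expr)
    print(expr)
    print(url)
```

Because the inner stream is an ordinary streaming expression, you could swap search() for any other stream source that emits an id field.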

Important Notes for Upgrades

There are two important changes to note if you’re upgrading to 8.4 or 8.5 from a version before 8.4.

Non-Default Codec Change (8.4)

First, if you have defined the postingsFormat or docValuesFormat parameter in any field or field type definition to a non-default codec (if you are using the Tagger Handler, for example, you would have done this), you will have to perform a bit of surgery to be able to use Solr after upgrade.

SolrCloud Overseer Queue Format Change (8.5)

Second, if you are using SolrCloud, you’ll want to take care during the upgrade to 8.5 due to a change in the format used for elements in the Overseer queues and maps. There are no configuration changes and you should otherwise notice no difference, but you’ll want to follow the suggestions in the Solr Upgrade Notes carefully for a successful upgrade.

Remove the friction of upgrades with Lucidworks Managed Search

Lucidworks offers Apache Solr as a managed service for those who want to avoid the hassle of upgrading and managing infrastructure, but still want to leverage all of Solr’s latest and greatest features. Lucidworks Managed Search is available now for preview. Learn more in my colleague Marcus Eagan’s recent blog post.
