Search technologies

A range of core search technologies and search servers are available both as open source and as commercial products. Below is an overview of some of the search technologies we use.

Apache Lucene

Apache Lucene is a mature, high-performance and full-featured search engine library with a large and active developer and user community.

Apache Lucene supports ranked searching, powerful query types (phrase, wildcard, proximity, range and more), fielded search, sorting by any field, multiple-index searching with merged results, simultaneous update and search, and much more.
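
The query types listed above correspond to the syntax of Lucene's classic query parser. As a rough illustration only, they might look like the strings below. The field names (title, body, year) are made up for this sketch, and the strings are held in Python purely for convenience, since Lucene itself is a Java library.

```python
# Example query strings in Lucene's classic query parser syntax.
# The field names (title, body, year) are hypothetical.
LUCENE_QUERY_EXAMPLES = {
    "fielded": 'title:lucene',
    "phrase": 'body:"search engine"',
    "wildcard": 'title:sear*',
    "proximity": 'body:"apache search"~5',
    "range": 'year:[2010 TO 2014]',
}


def combine(*clauses: str) -> str:
    """Join clauses with AND so that every clause must match."""
    return " AND ".join(f"({clause})" for clause in clauses)


# Combine a phrase query with a range query.
query = combine(LUCENE_QUERY_EXAMPLES["phrase"], LUCENE_QUERY_EXAMPLES["range"])
print(query)
```

A string like this would typically be handed to a query parser on the Java side, or passed as the query parameter to a server built on Lucene.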

In addition to being a core search engine library, Apache Lucene also provides very high-quality language support. Japanese language support is provided by Kuromoji, which is now a core part of Apache Lucene.

Apache Solr

Apache Solr is a very popular search server built atop the Apache Lucene search engine library.

Apache Solr provides REST-like interfaces and adds a number of very useful capabilities such as a schema with types and keys, extensions to the Apache Lucene query language, faceted search and filtering, geospatial search, configurable text analysis, easy XML configuration, a web-based administration interface and a whole lot more.
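
To make the REST-like interface and faceted search a little more concrete, here is a minimal sketch that builds a request URL for Solr's select handler. The parameters q, fq, facet, facet.field, rows and wt are standard Solr request parameters; the host, the core name "products" and the field names are assumptions made up for this example, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def solr_select_url(base_url: str, core: str, query: str,
                    filter_query: str, facet_field: str, rows: int = 10) -> str:
    """Build a URL for Solr's /select handler with one filter and one facet."""
    params = [
        ("q", query),                  # main query
        ("fq", filter_query),          # filter query, narrows the result set
        ("facet", "true"),             # turn faceting on
        ("facet.field", facet_field),  # field to compute facet counts for
        ("rows", str(rows)),           # number of documents to return
        ("wt", "json"),                # response format
    ]
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical host, core and fields, for illustration only.
url = solr_select_url("http://localhost:8983", "products",
                      "title:camera", "in_stock:true", "category")
print(url)

# Round-trip the query string to show which parameters were encoded.
decoded = parse_qs(urlsplit(url).query)
print(decoded["facet.field"])
```

The URL could then be fetched with any HTTP client. Against a real Solr core, the response would contain the matching documents together with facet counts for the chosen field.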

Apache Solr is mature and easy to use, and is used by leading businesses such as Apple, Instagram, Netflix, The Guardian, Disney and many more.

elasticsearch

elasticsearch is another search server built atop Apache Lucene.

elasticsearch is document-oriented, schemaless and provides REST/JSON interfaces. It's designed for simple, scalable and fault-tolerant deployment in a cloud-like fashion and also supports multi-tenancy.
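
As a small illustration of the REST/JSON interface, the sketch below builds the path and JSON body for a full-text match query against the _search endpoint. The index name "articles" and the field "title" are hypothetical, and the request is only constructed, not sent.

```python
import json


def match_search_request(index: str, field: str, text: str, size: int = 10):
    """Return (path, JSON body) for a full-text match query on one index."""
    path = f"/{index}/_search"
    body = {
        "query": {"match": {field: text}},  # analyzed full-text match on one field
        "size": size,                       # maximum number of hits to return
    }
    return path, json.dumps(body)


# Hypothetical index and field, for illustration only.
path, body = match_search_request("articles", "title", "search engine")
print("POST", path)
print(body)
```

Sending this body to a running node with any HTTP client should return the matching documents as JSON. Because documents and queries are both plain JSON, no schema has to be declared up front for a simple case like this.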

Japanese language support based on Kuromoji is available for elasticsearch as a plugin.

FAST ESP

ESP is a commercial search engine developed by Fast Search & Transfer (FAST), which is now part of Microsoft.

The ESP search engine was one of the leading search engines a few years ago and still powers many major search services globally. It is now a legacy technology, however, as it is no longer actively developed by Microsoft.

We have extensive experience with ESP, having spent 10 years with FAST/Microsoft.

MarkLogic Server

MarkLogic is a commercial XML database that provides XQuery support with additional free-text search capabilities and big data characteristics.

MarkLogic Server can index very large amounts of XML data and can be used both as a storage system and as a search engine, which opens up new ways of building applications. XQuery provides support for very complex queries and data transformations, and combined with the free-text search capabilities, this makes MarkLogic Server a very compelling technology for building applications.

MarkLogic Server is multi-tenant and also offers a REST interface with JSON support in the latest version. Searching Japanese is supported using morphological analysis and n-gramming. MarkLogic Server also integrates with Apache Hadoop.