Datafari can index hundreds of millions of documents across several machines, using a Hadoop-like big data architecture.
In distributed mode, ZooKeeper and SolrCloud manage system failures automatically.
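As a rough illustration of how cluster health can be observed from the outside, the sketch below builds a Solr Collections API `CLUSTERSTATUS` URL and scans a response for replicas that are not active. The base URL, the collection name `demo`, and the sample response are hypothetical and hand-written for this example; the response layout is assumed to follow the usual `cluster` / `collections` / `shards` / `replicas` nesting, and Datafari's own monitoring may work differently.

```python
from urllib.parse import urlencode


def cluster_status_url(base_url: str) -> str:
    """Build the Collections API URL that reports SolrCloud cluster state."""
    params = urlencode({"action": "CLUSTERSTATUS", "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/collections?{params}"


def inactive_replicas(status: dict) -> list:
    """Return (collection, shard, replica) for each replica not in the 'active' state."""
    found = []
    collections = status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                if replica.get("state") != "active":
                    found.append((coll_name, shard_name, replica_name))
    return found


# Hypothetical, hand-written response fragment (not real cluster output).
sample_status = {
    "cluster": {
        "collections": {
            "demo": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"state": "active"},
                            "core_node2": {"state": "down"},
                        }
                    }
                }
            }
        }
    }
}

if __name__ == "__main__":
    print(cluster_status_url("http://localhost:8983/solr"))
    print(inactive_replicas(sample_status))
```

In a real deployment the dictionary would come from an HTTP GET on the generated URL; it is kept offline here so the sketch runs without a cluster.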
Datafari supports near real-time management, multiple search field data types (int, string, date...), a schema-less mode, and the possibility to add dynamic fields.
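To make the idea of dynamic fields concrete, here is a minimal sketch that builds the JSON body of a Solr Schema API `add-dynamic-field` command. The helper name, the field patterns, and the field type names (`pint`, `pdate`, as found in recent default Solr configsets) are assumptions for illustration; the types actually available depend on the schema Datafari ships.

```python
import json


def add_dynamic_field_command(pattern: str, field_type: str,
                              stored: bool = True, indexed: bool = True) -> str:
    """Return the JSON body for a Schema API 'add-dynamic-field' command.

    `pattern` is a wildcard name such as '*_i'; any document field whose
    name matches it is indexed with `field_type` without being declared
    individually in the schema.
    """
    command = {
        "add-dynamic-field": {
            "name": pattern,
            "type": field_type,
            "stored": stored,
            "indexed": indexed,
        }
    }
    return json.dumps(command)


if __name__ == "__main__":
    # Hypothetical patterns: integer fields end in _i, date fields in _dt.
    print(add_dynamic_field_command("*_i", "pint"))
    print(add_dynamic_field_command("*_dt", "pdate"))
```

The resulting string would typically be sent as an HTTP POST to the `schema` endpoint of the target collection.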
When the crawler and the engine are separated, data transmitted from the crawling connectors to the search engine is sent encrypted over HTTPS.
Solr is a web layer built on Lucene. It adds functionality such as a web server, clustering, and web pages for administration. It is not a complete search system able to connect to data sources and display results. Rather, it is a reliable backbone that scales through machine clustering, is easy to manage, and supports REST calls. You can find more information on the Solr technical website. Like Lucene, Solr is part of the Apache Lucene project and, as such, is available under the Apache v2 license. Solr is the reference open source search engine. It offers advanced functionality, is easy to configure, and is thus a fierce competitor to proprietary search technologies.
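Since Solr is queried through REST calls, a search request is essentially a URL. The sketch below assembles a request to Solr's standard `select` handler; the host, the collection name `demo`, and the query string are hypothetical, and a Datafari installation may expose its own handlers and collection names.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, collection: str, query: str, rows: int = 10) -> str:
    """Build a URL for Solr's 'select' request handler.

    q    : the query string, in Solr query syntax
    rows : maximum number of results to return
    wt   : response format (JSON here)
    """
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/select?{params}"


if __name__ == "__main__":
    # Hypothetical local Solr instance and collection.
    print(build_select_url("http://localhost:8983/solr", "demo", "title:report", rows=5))
```

Any HTTP client can then issue a GET on this URL and parse the JSON response, which is what lets a front end add data connectors and result display on top of Solr.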