The framework

Administration

Our interface lets you manage the connectors and, if necessary, the connection to your LDAP/AD directory. It also lets you monitor the system status and the document retrieval status.

Security

We provide a graphical interface to manage access rights to the documents being searched, connected to your company's AD/LDAP setup.

Load management

Manage the load on your data sources in terms of threads, number of retrieved documents, and document size. Define time windows for crawling.
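As an illustration only, the kind of policy described above can be sketched in a few lines of Python. The LoadPolicy class, its field names and its methods are hypothetical and are not part of our interface or of Apache ManifoldCF. They only show how a thread cap, a document size cap and a crawl time window might be combined.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class LoadPolicy:
    """Hypothetical load policy for one data source (illustrative names only)."""
    max_threads: int          # maximum concurrent fetch threads
    max_document_bytes: int   # documents larger than this are skipped
    window_start: time        # crawling allowed from this time of day...
    window_end: time          # ...up to, but not including, this time

    def in_window(self, now: time) -> bool:
        """Return True if `now` falls inside the crawl window."""
        if self.window_start <= self.window_end:
            return self.window_start <= now < self.window_end
        # The window wraps past midnight, e.g. 22:00 to 06:00.
        return now >= self.window_start or now < self.window_end

    def may_fetch(self, now: time, active_threads: int, size_bytes: int) -> bool:
        """Return True if one more document of `size_bytes` may be fetched now."""
        return (
            self.in_window(now)
            and active_threads < self.max_threads
            and size_bytes <= self.max_document_bytes
        )


# Example: crawl only at night, with at most 4 threads and 10 MB per document.
night_policy = LoadPolicy(
    max_threads=4,
    max_document_bytes=10_000_000,
    window_start=time(22, 0),
    window_end=time(6, 0),
)
```

With this sketch, night_policy.may_fetch(time(23, 0), 3, 1000) is allowed, while the same request at noon, or with 4 threads already active, is refused.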

Processing filters

Create document processing filters, for instance with regular expressions that include or exclude documents or folders.
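To give an idea of what such a filter does, here is a minimal Python sketch of include/exclude rules based on regular expressions. The PathFilter class and the sample patterns are assumptions made for illustration and do not reflect the actual filter configuration of the product. The sketch also assumes that exclusion rules take precedence over inclusion rules.

```python
import re
from dataclasses import dataclass, field


@dataclass
class PathFilter:
    """Hypothetical include/exclude filter on document paths."""
    include: list[str] = field(default_factory=list)
    exclude: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Compile the patterns once so they can be reused for every document.
        self._include = [re.compile(p) for p in self.include]
        self._exclude = [re.compile(p) for p in self.exclude]

    def accepts(self, path: str) -> bool:
        """Return True if the document at `path` should be processed."""
        # Assumption: an exclusion match always wins.
        if any(p.search(path) for p in self._exclude):
            return False
        # With no include pattern, everything not excluded is accepted.
        if not self._include:
            return True
        return any(p.search(path) for p in self._include)


# Example: keep PDF and DOCX files, but skip anything under an "archive" folder.
office_filter = PathFilter(
    include=[r"\.(pdf|docx)$"],
    exclude=[r"/archive/"],
)
```

Here office_filter.accepts("/shares/hr/policy.pdf") returns True, whereas "/shares/archive/old.pdf" and "/shares/hr/photo.jpeg" are both rejected.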


Connectors

File shares

Index your file shares (NetApp, Windows, Samba, Dropbox...), securely. Manage OCR. Handle many formats (ppt, xls, html, jpeg, MS Office, OpenOffice...).

CMS and portals

Index your CMS (Content Management System), ECM (Enterprise Content Management) or portals (Liferay, Alfresco, SharePoint, Documentum, FileNet, CMIS...), securely.

Emails

Index your emails to securely leverage this knowledge mine. Connect directly to the email server (Postfix, Exchange) or retrieve messages via IMAP/POP3.

Everything else

Databases, social networks, and more. A plugin mechanism lets you develop new connectors. You can either build them yourself or rely on our know-how.

Apache ManifoldCF

It is a project from the Apache Software Foundation. Initially created to enrich Apache Lucene/Solr with a connectors framework, it was later turned into a standalone project by the community. You can get more info on the website of Apache ManifoldCF. Apache ManifoldCF takes care of retrieving data in many different formats from diverse data sources. It offers a plugin mechanism for adding new connectors. It manages ACLs and access rights. It is available under the Apache License 2.0.