From the website:
Northern Light Enterprise Search Engine Features
Performance. With a database indexing 5 million documents totaling 25 gigabytes of content, and using a single query server with a single software installation, Northern Light is rated at 216 queries per second with a query response time averaging 80 milliseconds.
Scalability. Northern Light can search databases of more than 50 million documents with a single software installation on a single server. By contrast, some enterprise search engine vendors require you to add another server appliance every time you want to add as few as 150,000 documents to your database.
Relevance ranking effectiveness. Northern Light has a unique seventeen-factor approach to relevance ranking that considers statistical text measures, hyperlink analysis, subject classification, and date, and dynamically weights these factors based on what will be most useful for a given query. What, you ask, are statistical text measures? A few examples: the number of times the query terms appear in the document relative to the length of the document, the proximity of the query terms in the document, the word order of the query terms in the document, the presence of the query terms in the document metadata, and the inverse term frequency of the query terms in the database as a whole.
Automatic classification. Northern Light has patented, proprietary technology that classifies every document in the database by subject, type, language, and source. We provide a complete 17,000-node subject taxonomy developed by our expert gang of librarians that is extensible and customizable. Our classification powers advanced search forms, vertical search applications, and our patented Custom Search Folders™ for results navigation.
Flexible query parsing. Northern Light allows keywords, Boolean expressions (all operators, compound, and nested), natural language, phrase searching, wildcards, and any combination of these.
Search on any metadata. All metadata is represented in the index, which means you can use search forms or syntax to qualify the results. Search on title, source, document type, etc. You can add any metadata that makes sense to your organization and search on that tag.
Security. Northern Light integrates with your network authentication, and all security protocols are observed. That means users can only access information for which they're authorized, and you can easily add and remove users.
Open API. Our search engine has well-documented APIs using J2EE standards, XML search results, and JSP sample code that support the integration of Northern Light into corporate applications.
Content integration. Our published load format specification allows any file type, from any source, located anywhere to be indexed and searched. The data conversion system includes filters for Microsoft Office, PDF, HTML (including JSP and ASP), and text formats including XML.
Discovery-based crawler. Northern Light's crawler follows links on your network to discover content for indexing. The crawler connects via HTTP, HTTPS, FTP, NFS and SMB (Windows) protocols and supports multiple authentication methods.
Administration tools. We provide a browser-based administration system that includes a basic search UI; a scheduler to manage crawling, data conversion, and database loading; and a system configuration manager.
Platforms. Northern Light is available on Linux.
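
To make the "statistical text measures" named under relevance ranking a bit more concrete, here is a minimal Python sketch of three of them: term frequency relative to document length, query-term proximity, and rarity of the term across the database. It is not Northern Light's algorithm, which is proprietary and not described in the source. The function names, the smoothed inverse document frequency formula (standing in for what the copy calls "inverse term frequency"), and the 0.1 proximity weight are all assumptions chosen for illustration.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def length_normalized_tf(term, tokens):
    """Occurrences of the term divided by the document length in tokens."""
    return tokens.count(term) / len(tokens) if tokens else 0.0


def inverse_document_frequency(term, corpus_tokens):
    """Smoothed IDF (an assumed formula): rarer terms get a higher weight."""
    containing = sum(1 for tokens in corpus_tokens if term in tokens)
    return math.log((1 + len(corpus_tokens)) / (1 + containing)) + 1.0


def min_span(query_terms, tokens):
    """Width in tokens of the smallest window holding every query term.

    Returns None when at least one query term is missing from the document.
    """
    needed = set(query_terms)
    positions = [(i, t) for i, t in enumerate(tokens) if t in needed]
    best = None
    for start in range(len(positions)):
        seen = set()
        for end in range(start, len(positions)):
            seen.add(positions[end][1])
            if seen == needed:
                width = positions[end][0] - positions[start][0] + 1
                if best is None or width < best:
                    best = width
                break
    return best


def score(query, doc, corpus, proximity_weight=0.1):
    """Combine the measures into one number; the weight is arbitrary."""
    query_terms = tokenize(query)
    doc_tokens = tokenize(doc)
    corpus_tokens = [tokenize(c) for c in corpus]
    tf_idf = sum(
        length_normalized_tf(t, doc_tokens)
        * inverse_document_frequency(t, corpus_tokens)
        for t in query_terms
    )
    span = min_span(query_terms, doc_tokens)
    proximity = 1.0 / span if span else 0.0
    return tf_idf + proximity_weight * proximity
```

Calling `score("search engine", doc, corpus)` for each document in a small list of strings and sorting by the result gives a toy ranking. A production engine would precompute these statistics in its index rather than re-tokenizing the corpus for every query.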


