TeraText® Database System
The TeraText Database System (TeraText DBS) includes all of the functionality needed to store, manage, search, and retrieve a large collection of text documents.
Outstanding Performance, Scalability, and Reliability
Information inserted into the database is instantly available for search and retrieval. The system scales to support a thousand interactive updates per second while continuing to allow thousands of end users to access the collection.
Instant Access to Information
Information inserted into the database becomes instantly available for search and retrieval. There is no down-time while the database is being updated.
Unsurpassed Indexing / Retrieval Speed for Structured Text Documents
TeraText DBS scales to support over a thousand interactive updates per second while continuing to allow thousands of end users to access the collection. XML is stored natively to eliminate the time-consuming process of document decomposition and reconstruction.
Scales to Index & Query Text Collections from Gigabytes to Multi-Terabytes
TeraText DBS was designed to support distributed search and retrieval across text collections ranging from small to very large, whether static or updated in real time. It presents a single logical view over the physical collection of databases. Large collections are generally distributed across many smaller physical databases, which can either be appended to form one database or aliased together so that they are searched as one. This allows you to create, manage, and search collections of multiple terabytes or more.
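The idea of a single logical view over several physical databases can be illustrated with a minimal sketch. The class names `PhysicalDatabase` and `LogicalView`, the sample data, and the naive word matching are all hypothetical and are not part of any TeraText API; the sketch only shows a query being fanned out to each member database and the hits being merged.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PhysicalDatabase:
    """Hypothetical stand-in for one physical database: doc id -> text."""
    name: str
    docs: dict[str, str] = field(default_factory=dict)

    def search(self, term: str) -> list[str]:
        # Naive case-insensitive word match; a real engine would consult an index.
        needle = term.lower()
        return sorted(
            doc_id for doc_id, text in self.docs.items()
            if needle in text.lower().split()
        )


@dataclass
class LogicalView:
    """Hypothetical alias that presents several physical databases as one."""
    members: list[PhysicalDatabase]

    def search(self, term: str) -> list[tuple[str, str]]:
        # Fan the query out to every member and merge the hits,
        # tagging each hit with the database it came from.
        hits: list[tuple[str, str]] = []
        for db in self.members:
            hits.extend((db.name, doc_id) for doc_id in db.search(term))
        return hits


db_2023 = PhysicalDatabase(
    "news_2023", {"a1": "Flood warning issued", "a2": "Market report"}
)
db_2024 = PhysicalDatabase("news_2024", {"b1": "Second flood this year"})
view = LogicalView([db_2023, db_2024])

print(view.search("flood"))  # [('news_2023', 'a1'), ('news_2024', 'b1')]
```

The caller only ever talks to `view`, so member databases can be added or removed without changing the search code.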
Note: Our largest deployed system currently holds several billion XML documents (8+ terabytes). In this implementation, TeraText DBS inserts and indexes up to 1,000 documents per second, and the information is immediately searchable by the end user. A complex full-text search across the entire collection can be completed in seconds. We have a team of experienced developers who will work with you to deliver total solutions, and we offer a full training package to get your own developers up to speed.
Survives Server Failures
TeraText DBS is designed to recover automatically from unexpected problems. Power failure? OS crash? No problem: TeraText DBS will restart without losing a single record, ready to resume normal operations.
Minimizes Storage Requirements
TeraText DBS uses sophisticated compression techniques. Compressing the text minimizes the size of the data files, and specialized index compression techniques enable ultra-fast text searching. In many instances, the combined storage required for the indices and documents is no larger than that of the original collection.
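The specific compression techniques are not described here. As a general illustration of why index compression can be effective, the sketch below uses a common textbook approach, delta encoding followed by variable-byte encoding of a sorted list of document IDs. It is an assumption chosen for illustration, not a description of how TeraText DBS compresses its indices, and the function names are hypothetical.

```python
def encode_postings(doc_ids: list[int]) -> bytes:
    """Delta-encode a sorted list of doc ids, then variable-byte encode the gaps."""
    out = bytearray()
    previous = 0
    for doc_id in doc_ids:
        gap = doc_id - previous  # small when ids are close together
        previous = doc_id
        # Emit 7 bits per byte, low-order bits first;
        # a set high bit means "more bytes follow".
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_postings(data: bytes) -> list[int]:
    """Reverse encode_postings: rebuild the gaps, then the running doc ids."""
    doc_ids: list[int] = []
    current = 0
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += gap
            doc_ids.append(current)
            gap = 0
            shift = 0
    return doc_ids


postings = [1000, 1003, 1010, 1500, 200000]
packed = encode_postings(postings)

print(len(packed))              # 9 bytes, versus 20 bytes as five 32-bit integers
print(decode_postings(packed))  # [1000, 1003, 1010, 1500, 200000]
```

Because only the gaps between neighbouring IDs are stored, dense lists of nearby documents shrink the most.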
Flexible Integration with a Modular, Standards-Based System
TeraText DBS components are modular and can be installed as a suite or as individual modules to work with existing database management and document-authoring systems.
Supports XML, SGML, Unicode, Z39.50, HTTP and Other Industry Standards
TeraText DBS is based on open standards. Leading text and document standards are supported to ensure that TeraText-based solutions have a long life and can co-exist with current and future infrastructure.
Unique Applications Server Provides Immediate Access to any TeraText Database
TeraText DBS supports plug-and-play modules for complex, value-added web services.
Built on the Z39.50 Standard — the Library of Congress Standard Protocol for Information Retrieval
This is the only worldwide industry standard protocol for information retrieval in a distributed environment. This protocol allows TeraText DBS to scale to support multi-terabyte collections.
Provides A Rich Development Environment that Includes Java, C++, and .NET® APIs
Custom applications are a snap thanks to an extensive suite of libraries that provide ingest, indexing, searching, retrieval and many other capabilities.
Comprehensive Security Features
TeraText DBS provides role-based access to data at the field, record, and database levels. This enables an administrator to restrict access to sensitive data down to the level of specific XML nodes. TeraText DBS has a very strict security model, designed to prevent unauthorized users from even being aware of the existence of sensitive data. Other security features include support for Lightweight Directory Access Protocol (LDAP), Kerberos and the Generic Security Service (GSS), and Secure Sockets Layer/Transport Layer Security (SSL/TLS) to identify, authenticate, and authorize users and protect and encrypt sensitive information.
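Node-level restriction of this kind can be sketched with Python's standard `xml.etree.ElementTree` module. The `POLICY` mapping, the role names, and the `visible_copy` function are hypothetical and do not reflect the TeraText security configuration; the sketch only shows restricted nodes being omitted entirely, so that an unauthorized user sees no trace of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy: element tag -> role required to see that element.
POLICY = {"salary": "hr", "notes": "manager"}


def visible_copy(element: ET.Element, roles: set) -> ET.Element:
    """Return a copy of the tree that leaves out nodes the user may not see."""
    copy = ET.Element(element.tag, dict(element.attrib))
    copy.text = element.text
    copy.tail = element.tail
    for child in element:
        required = POLICY.get(child.tag)
        if required is not None and required not in roles:
            continue  # omit the node entirely, so its existence is not revealed
        copy.append(visible_copy(child, roles))
    return copy


record = ET.fromstring(
    "<employee><name>Ada</name><salary>90000</salary>"
    "<notes>Due for review</notes></employee>"
)

print(ET.tostring(visible_copy(record, {"hr"}), encoding="unicode"))
# <employee><name>Ada</name><salary>90000</salary></employee>
```

Filtering a copy rather than the stored document leaves the authoritative record untouched.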
As an XML-capable product, TeraText DBS was designed to store, retrieve, and manipulate semi-structured text. Because XML (and its predecessor, SGML) is stored natively, you get back exactly what you put in. No time-consuming document decomposition or reconstruction is required, and documents remain intact for faster updates and quicker access. The system also indexes all or part of each document using XML standards, enabling complex and comprehensive searching. In addition to storing XML natively, TeraText DBS can store other fielded data alongside that XML, such as filenames, time stamps, and arbitrary binary data (for example, the native Word or PDF document from which the XML content was derived). This allows applications to take advantage of powerful XML capabilities without altering authoritative XML data that is created in other environments or tools.
Supports Complex Searches
TeraText DBS has integrated support for an extensive array of search capabilities including:
- Full text and fielded
- Proximity operators (near, order)
- Text structure operators (with [in same paragraph], same [in same sentence])
- Range operators (string, numeric)
- Fuzzy match, stemming, weighted
- Limit operations
- Custom case folding, punctuation stripping, transformations, expansions, etc.
- Boolean operators (and, or, not, xor)
- Wildcards for characters and words (#, #n, ?, ?n)
- Relevance ranked search
- Index scan operations to search the index
- Hit highlighting
- Saved searches
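To give a feel for what proximity operators do, here is a minimal sketch of an unordered and an ordered proximity check over tokenized text. The function names, the tokenization rule, and the distance semantics are assumptions for illustration; they are not the TeraText query syntax, and a real engine would answer such queries from positional indices rather than by scanning the text.

```python
import re


def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def near(text: str, first: str, second: str, distance: int,
         ordered: bool = False) -> bool:
    """True if the two terms occur within `distance` words of each other.

    With ordered=True, `first` must also appear before `second`.
    """
    tokens = tokenize(text)
    first_positions = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_positions = [i for i, t in enumerate(tokens) if t == second.lower()]
    for a in first_positions:
        for b in second_positions:
            gap = b - a
            if ordered and gap <= 0:
                continue  # wrong order for an ordered match
            if 0 < abs(gap) <= distance:
                return True
    return False


sentence = "The database indexes XML documents quickly"

print(near(sentence, "xml", "database", 2))                # True
print(near(sentence, "xml", "database", 2, ordered=True))  # False
print(near(sentence, "database", "xml", 2, ordered=True))  # True
```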
Questions? Contact Us
For additional information about the TeraText suite of products, please contact us today.