NeoCore: Powerful XML-DB for document management – Features & Architecture

XML is widely used as the de facto standard format for documents, but its hierarchical data structures are difficult to manage with relational databases (RDBs). In content management, XML is commonly used to describe metadata, and NeoCore is designed to handle the large volumes of XML data generated in such environments. This section introduces NeoCore, an XML database (XML DB) with capabilities not found in traditional RDBs, and covers its features, its architecture, and its ultra-fast search technology, DPP.

Detailed Overview of NeoCore’s Features

NeoCore is a new type of XML database engine built around three ideas: it handles any XML data ("anything"), it searches quickly ("fast"), and it is simple to develop against ("easy"). No schema definitions are required, so NeoCore stores XML of any format as-is and preserves XML's inherent flexibility. Its full auto-indexing feature creates indexes for all tags, which removes the manual index design that has traditionally been a major burden for system developers and improves development efficiency. For search, NeoCore uses DPP (Digital Pattern Processing): instead of accessing the original data directly, it performs pattern matching on flat-structured "icons", which keeps search performance stable regardless of data volume or structural complexity.

1. “Anything” — Effortlessly handle all XML data without the need for schema definitions

NeoCore does not require schema definitions when storing XML data. When storing or updating XML data, it performs only a well-formedness check (parsing) and does not validate against a DTD or XML Schema. It can therefore store XML in any format, independently of any DTD or XML Schema, and elements can be added or removed freely as needed.

Relational databases (RDBs) require strict schema definitions for all stored data, so any specification change means redefining the schema. In contrast, database design with NeoCore consists of arranging nodes and attributes in a tree and adjusting that tree with performance in mind. An RDB design must be finalized before implementation, whereas with NeoCore a certain amount of structural change can be absorbed by the application, as long as the basic framework is established.

This approach reduces the system engineering cost and time spent on schema design and on index redesign driven by business-side requirements. The advantage is greatest when the data structure cannot be precisely defined early in development, or when data fields are added or modified during the operation phase. NeoCore also maintains ACID properties* while managing large volumes of schema-less XML data, which gives it the reliability expected of a DBMS.

* ACID properties: the four characteristics (atomicity, consistency, isolation, durability) required for transaction processing, which manages multiple related operations as a single unit of work.
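The difference between a well-formedness check and schema validation can be illustrated with a minimal Python sketch. It uses only the standard library parser and is not NeoCore's API; the function name `is_well_formed` and the sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML. No DTD or XML Schema is consulted."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical documents: doc_b adds elements and an attribute that doc_a lacks.
doc_a = "<book><title>XML Basics</title></book>"
doc_b = "<book><title>XML Basics</title><isbn>000</isbn><tag lang='en'/></book>"
broken = "<book><title>Unclosed</book>"  # mismatched tags, not well-formed

results = [is_well_formed(d) for d in (doc_a, doc_b, broken)]
print(results)  # [True, True, False]
```

Both `doc_a` and `doc_b` pass even though their structures differ, because nothing compares them against a declared schema; only the syntactically broken document is rejected.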

2. “Fast” — Ultra-Fast XML Search Performance

NeoCore combines the W3C-standard query language XQuery with its proprietary DPP (Digital Pattern Processing) technology to provide high-speed searches at the attribute level, something traditional full-text search engines cannot do. When XML is handled in a relational database (RDB), whether by mapping it to table structures or by storing it in XML data fields, search performance can degrade significantly. With NeoCore, the DPP engine is designed to keep searches fast and stable regardless of XML data volume or hierarchical depth.

When XML data is stored, it passes through a parser and is then processed by a module called the Flattener, which breaks it down into paths and values. DPP converts each path and value into a fixed 64-bit structure, generating a unique "icon" for every tag. At search time NeoCore does not traverse the original XML structure. Instead, it pattern-matches the flattened icons, which function as indexes, and uses them to locate where the actual data is stored.
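The parse, flatten, and icon steps described above can be sketched in Python. This is only an illustration under stated assumptions: DPP is proprietary and its actual encoding is not described here, so a 64-bit BLAKE2b hash stands in for the icon (a hash is not guaranteed to be unique, unlike the icons described in the text). The functions `flatten` and `icon`, the path notation, and the sample document are all hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Break a document into (path, value) pairs for attributes and element text."""
    pairs = []

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        for name, value in elem.attrib.items():
            pairs.append((f"{path}/@{name}", value))
        text = (elem.text or "").strip()
        if text:
            pairs.append((path, text))
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return pairs


def icon(path, value):
    """Map a path/value pair to a fixed 64-bit integer (assumed stand-in for a DPP icon)."""
    data = f"{path}\x00{value}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


doc = "<order id='42'><customer>Acme</customer><item><sku>A-1</sku></item></order>"
pairs = flatten(doc)
# pairs == [("/order/@id", "42"), ("/order/customer", "Acme"), ("/order/item/sku", "A-1")]

# Flat lookup table: icon -> original pair. A search hashes the query and probes
# this table instead of walking the XML tree.
icons = {icon(p, v): (p, v) for p, v in pairs}
hit = icons.get(icon("/order/customer", "Acme"))
print(hit)  # ('/order/customer', 'Acme')
```

Because every pair is reduced to a value of the same fixed size, the cost of a lookup in this sketch does not depend on how deeply the matching element was nested in the source document.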

3. “Easy” — Fully Automated XML Indexing

NeoCore's Full Auto-Indexing capability indexes all XML data automatically when it is stored and assigns key IDs automatically, which reduces development effort and lowers operational and maintenance costs. It removes the manual index design and creation that relational databases (RDBs) typically require. Developers no longer have to decide which fields to index and can instead concentrate on writing searches that use these indexes effectively. When storing or updating data, NeoCore uses its patented DPP technology to break down the XML structure, then stores the data in binary form, classified by type using its own method.
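The idea of indexing everything at store time, with IDs assigned automatically, can be sketched as a small in-memory store in Python. This is a toy model, not NeoCore's implementation or API: the class `AutoIndexedStore`, its methods, and the path notation are assumptions for illustration.

```python
import itertools
import xml.etree.ElementTree as ET
from collections import defaultdict


class AutoIndexedStore:
    """Toy store: every path/value pair is indexed on insert, with no index design step."""

    def __init__(self):
        self._ids = itertools.count(1)   # automatic key ID assignment
        self.docs = {}                   # doc id -> original XML text
        self.index = defaultdict(set)    # (path, value) -> set of doc ids

    def store(self, xml_text):
        root = ET.fromstring(xml_text)   # well-formedness check only
        doc_id = next(self._ids)
        self.docs[doc_id] = xml_text
        self._index_element(root, "", doc_id)
        return doc_id

    def _index_element(self, elem, prefix, doc_id):
        path = f"{prefix}/{elem.tag}"
        for name, value in elem.attrib.items():
            self.index[(f"{path}/@{name}", value)].add(doc_id)
        text = (elem.text or "").strip()
        if text:
            self.index[(path, text)].add(doc_id)
        for child in elem:
            self._index_element(child, path, doc_id)

    def find(self, path, value):
        """Return the ids of documents containing the given path/value pair."""
        return sorted(self.index.get((path, value), set()))


store = AutoIndexedStore()
first = store.store("<doc><author>Sato</author><status>draft</status></doc>")
second = store.store("<doc><author>Sato</author><reviewer>Kim</reviewer></doc>")

print(store.find("/doc/author", "Sato"))   # [1, 2]
print(store.find("/doc/reviewer", "Kim"))  # [2]
```

The `reviewer` element appears only in the second document, yet it is searchable immediately without anyone declaring an index for it, which is the property the auto-indexing description emphasizes.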

Internally, NeoCore stores and manages its data in the following files:

  • Dictionary File: Stores the actual tag and data contents.
  • Index File: Stores index information for tags and data.
  • Duplicate File: Stores duplicate index information for tags and data.
  • Cross-Reference File: Provides mutual referencing between the dictionary and index files.
  • Map File: Stores the physical structure information of the XML data.
  • Admin File: Contains administrative information such as database location and the number of stored documents.