This work assesses the feasibility of using different NoSQL databases to handle a very large tree data structure with heterogeneous nodes, where heterogeneity means that each node can carry its own unique set of attributes. This requirement arises prominently in structured log analysis, where the constituents of a software log file are examined hierarchically. Traditional relational databases do not handle such data efficiently. We therefore turn to the NoSQL paradigm, which has been emerging as a prominent solution for high volumes of data with localized characteristics. Our exploration covers five NoSQL models: wide column stores, document stores, tuple stores, graph databases and multi-model databases, which collectively account for a large fraction of the NoSQL spectrum.

An experiment is designed to measure database performance against a generic tree API that focuses on three operations: node insertion, node query and attribute-value query. The API is then implemented on one database selected from each of the five NoSQL models under consideration. Each implementation is used to test database performance for the three operations by measuring the time taken for a batch of similar operations on a machine with an average hardware and software configuration.
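To make the shape of such an experiment concrete, the following is a minimal Python sketch of a generic tree API and a batch timer. It is an illustration only, not the API used in this work: the names `TreeStore`, `insert_node`, `get_node`, `find_by_attribute` and `time_batch` are hypothetical, and an in-memory dictionary stands in for a real NoSQL backend.

```python
import time
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple


class TreeStore(ABC):
    """Hypothetical generic tree API; one subclass per database under test."""

    @abstractmethod
    def insert_node(self, node_id: str, parent_id: Optional[str],
                    attrs: Dict[str, Any]) -> None:
        """Insert a node under parent_id (None for the root)."""

    @abstractmethod
    def get_node(self, node_id: str) -> Dict[str, Any]:
        """Node query: return the stored record for node_id."""

    @abstractmethod
    def find_by_attribute(self, key: str, value: Any) -> List[str]:
        """Attribute-value query: ids of nodes whose attribute key equals value."""


class InMemoryTreeStore(TreeStore):
    """Stand-in backend (an assumption for illustration, not a real database)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Dict[str, Any]] = {}

    def insert_node(self, node_id, parent_id, attrs):
        if parent_id is not None and parent_id not in self._nodes:
            raise KeyError(f"unknown parent: {parent_id}")
        # Heterogeneous nodes: each node keeps its own free-form attribute set.
        self._nodes[node_id] = {"parent": parent_id, "attrs": dict(attrs)}

    def get_node(self, node_id):
        return self._nodes[node_id]

    def find_by_attribute(self, key, value):
        # Linear scan; a real backend would typically rely on an index.
        return [nid for nid, rec in self._nodes.items()
                if rec["attrs"].get(key) == value]


def time_batch(operation: Callable[..., Any],
               batch: Iterable[Tuple[Any, ...]]) -> float:
    """Return the wall-clock seconds taken to run operation over a batch."""
    start = time.perf_counter()
    for args in batch:
        operation(*args)
    return time.perf_counter() - start
```

Under these assumptions, each database would get its own `TreeStore` subclass, and the same batches would be passed to `time_batch` for each of the three operations so that the timings are comparable across backends.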

Collaborated with: University of Colombo School of Computing