Data value storage for compressed semi-structured data
Tripney, Brian Grieve and Ross, Isla and Wilson, Francis and Wilson, John; Decker, H and Lhotská, Lenka and Link, Sebastian and Basl, Josef and Tjoa, A Min, eds. (2013) Data value storage for compressed semi-structured data. In: Database and Expert Systems Applications. Lecture Notes in Computer Science, 8056. Springer-Verlag Berlin, Berlin, pp. 174-188. ISBN 9783642401725 (https://doi.org/10.1007/978-3-642-40173-2_16)
PDF (Preprint). Filename: 80560174_2_.pdf (951kB)
Abstract
Growing user expectations of anywhere, anytime access to information require new types of data representations to be considered. While semi-structured data is a common exchange format, its verbose nature makes files of this type too large to be transferred quickly, especially where only a small part of that data is required by the user. There is consequently a need to develop new models of data storage to support the sharing of small segments of semi-structured data since existing XML compressors require the transfer of the entire compressed structure as a single unit. This paper examines the potential for bisimilarity-based partitioning (i.e. the grouping of items with similar structural patterns) to be combined with dictionary compression methods to produce a data storage model that remains directly accessible for query processing whilst facilitating the sharing of individual data segments. Study of the effects of differing types of bisimilarity upon the storage of data values identified the use of both forwards and backwards bisimilarity as the most promising basis for a dictionary-compressed structure. A query strategy is detailed that takes advantage of the compressed structure to reduce the number of data segments that must be accessed (and therefore transferred) to answer a query. A method to remove redundancy within the data dictionaries is also described and shown to have a positive effect on memory usage.
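The idea in the abstract of partitioning data values by structural pattern and then dictionary-compressing each partition can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the sample document, and in particular the use of the root-to-node label path as the grouping key. For tree-shaped data that path corresponds to grouping by backwards bisimilarity only, so it is coarser than the combined forwards and backwards bisimilarity the abstract identifies as most promising.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
SAMPLE = (
    "<lib>"
    "<book><title>A</title><lang>en</lang></book>"
    "<book><title>B</title><lang>en</lang></book>"
    "</lib>"
)


def partition_values(xml_text):
    """Group text values by root-to-node label path.

    Assumption: the label path stands in for a structural partition;
    it is not the partitioning scheme evaluated in the paper.
    """
    segments = defaultdict(list)

    def walk(node, parent_path):
        path = parent_path + "/" + node.tag
        text = (node.text or "").strip()
        if text:
            segments[path].append(text)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return dict(segments)


def dictionary_encode(values):
    """Replace each value with an integer code into a per-segment dictionary."""
    dictionary, index, codes = [], {}, []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def build_store(xml_text):
    """Build one independently stored (dictionary, codes) pair per segment."""
    return {
        path: dictionary_encode(values)
        for path, values in partition_values(xml_text).items()
    }


def lookup(store, path, value):
    """Return positions in one segment whose value equals `value`.

    Only the segment for `path` is read; the other segments are untouched.
    """
    dictionary, codes = store.get(path, ([], []))
    if value not in dictionary:
        return []
    code = dictionary.index(value)
    return [i for i, c in enumerate(codes) if c == code]


store = build_store(SAMPLE)
```

In this sketch a query on a single path reads a single segment, which mirrors the stated goal of reducing the number of segments that must be accessed and transferred. The repeated value in the `/lib/book/lang` segment is stored once in that segment's dictionary.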
ORCID iDs
Tripney, Brian Grieve, Ross, Isla ORCID: https://orcid.org/0000-0003-0106-0745, Wilson, Francis and Wilson, John ORCID: https://orcid.org/0000-0002-5297-657X; Decker, H, Lhotská, Lenka, Link, Sebastian, Basl, Josef and Tjoa, A Min
Item type: Book Section
ID code: 44328
Dates: 14 August 2013 (Published)
Subjects: Bibliography. Library Science. Information Resources > Information resources > Electronic information resources
Department: Faculty of Science > Computer and Information Sciences
Depositing user: Pure Administrator
Date deposited: 09 Jul 2013 10:13
Last modified: 16 Dec 2024 01:04
URI: https://strathprints.strath.ac.uk/id/eprint/44328