Show simple item record

dc.contributor.author: Li, Xiang
dc.contributor.author: Li, Jiaxuan
dc.contributor.author: Fournier-Viger, Philippe
dc.contributor.author: Nawaz, M. Saqib
dc.contributor.author: Yao, Jie
dc.contributor.author: Lin, Jerry Chun-Wei
dc.date.accessioned: 2021-03-15T08:27:36Z
dc.date.available: 2021-03-15T08:27:36Z
dc.date.created: 2021-01-14T13:41:48Z
dc.date.issued: 2020
dc.identifier.citation: Li, X., Li, J., Fournier-Viger, P., Nawaz, M. S., Yao, J., & Lin, J. C.-W. (2020). Mining Productive Itemsets in Dynamic Databases. IEEE Access, 8, 140122-140144. [en_US]
dc.identifier.issn: 2169-3536
dc.identifier.uri: https://hdl.handle.net/11250/2733292
dc.description.abstract: Discovering frequent itemsets is a data analysis task used in numerous domains. It consists of finding sets of items (itemsets) that frequently appear in a set of database records (also called transactions). Though discovering frequent itemsets is useful, it can produce a large number of spurious patterns. As a result, the user may spend a great deal of time analyzing the itemsets found by a frequent itemset mining algorithm to find truly interesting patterns. Hence, in recent years, a key research topic has emerged: discovering statistically significant patterns in databases. The most popular model for identifying statistically significant itemsets is to discover non-redundant productive itemsets. The state-of-the-art algorithm for extracting this set of patterns is OPUS-Miner. A key drawback of that algorithm is that it is designed to be applied to a static database. A second drawback of OPUS-Miner is that it discovers all patterns in a database; in other words, the user cannot search for itemsets containing specific items. This paper addresses these issues by defining the novel problem of discovering targeted non-redundant productive itemsets in dynamic databases. An algorithm named IDPI+ (Interactive Discovery of Productive Itemsets) is presented. It stores transactions in a tree structure, which can then be interactively queried to identify productive and non-redundant itemsets containing specific items. A structure named Query-Tree is also introduced to process many queries at the same time. Moreover, to handle dynamic databases, efficient transaction insertion and deletion algorithms are provided to update the tree. An experimental evaluation on benchmark datasets containing various types of data showed that IDPI+ can handle thousands of queries per second on a desktop computer and that it is more than an order of magnitude faster than a baseline algorithm. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.title: Mining Productive Itemsets in Dynamic Databases [en_US]
dc.type: Journal article [en_US]
dc.type: Peer reviewed [en_US]
dc.description.version: publishedVersion [en_US]
dc.source.pagenumber: 140122-140144 [en_US]
dc.source.volume: 8 [en_US]
dc.source.journal: IEEE Access [en_US]
dc.identifier.doi: 10.1109/ACCESS.2020.3012817
dc.identifier.cristin: 1871369
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1
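
The abstract above describes frequent itemset mining and the idea of querying for itemsets that contain specific items. The following is a minimal Python sketch of those two notions only, using a brute-force search. It is not IDPI+ or OPUS-Miner, it uses no tree structure, and it does not test productivity or redundancy. The function names, the `min_support` threshold, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def frequent_itemsets_containing(target, transactions, min_support):
    """Brute-force search for frequent itemsets that include all `target` items.

    Hypothetical helper for illustration; exponential in the number of items.
    """
    target = frozenset(target)
    # Only transactions containing the target can support a matching itemset.
    matching = [t for t in transactions if target <= t]
    others = sorted(set().union(*matching) - target) if matching else []
    results = {}
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = target | frozenset(extra)
            count = support(candidate, matching)
            if count >= min_support:
                results[candidate] = count
    return results


# Toy database (made up): each transaction is a set of items.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"butter", "milk"}),
    frozenset({"bread", "butter"}),
]

# Targeted query: itemsets containing "bread" that occur in at least 2 transactions.
found = frequent_itemsets_containing({"bread"}, transactions, min_support=2)
for itemset, count in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Restricting the search to transactions that contain the target items reflects the targeted-query idea in the abstract; a dynamic database would additionally need the support counts to be updated when transactions are inserted or deleted, which this sketch does not attempt.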


Files in this item

This item appears in the following collection(s)

Attribution 4.0 International
Except where otherwise noted, this item is licensed under Attribution 4.0 International