dc.contributor.author: Cheng, Zaihe
dc.contributor.author: Shen, Wei
dc.contributor.author: Fang, Wei
dc.contributor.author: Lin, Jerry Chun-Wei
dc.date.accessioned: 2024-04-11T12:05:45Z
dc.date.available: 2024-04-11T12:05:45Z
dc.date.created: 2023-03-29T21:13:51Z
dc.date.issued: 2023
dc.identifier.citation: Complex System Modeling and Simulation. 2023, 3 (1), 47-58. [en_US]
dc.identifier.issn: 2096-9929
dc.identifier.uri: https://hdl.handle.net/11250/3126095
dc.description.abstract: High-utility itemset mining (HUIM), an essential task in data mining, considers not only the quantity of each item but also its profit. However, most HUIM algorithms are developed for a single machine, which is inefficient for big data because memory and processing capacity are limited. To solve this problem, this paper proposes a parallel efficient high-utility itemset mining (P-EFIM) algorithm based on the Hadoop platform. In P-EFIM, the transaction-weighted utilization values of the itemsets are calculated and ordered with the MapReduce framework. The ordered itemsets are then renumbered, and the low-utility itemsets are pruned to improve the dataset utility. In the Map phase, P-EFIM divides the task into multiple independent subtasks and uses the proposed S-style distribution strategy to distribute them evenly across all nodes to ensure load balancing. Furthermore, P-EFIM uses the EFIM algorithm to mine each subtask dataset in the Reduce phase to enhance performance. Experiments are performed on eight datasets, and the results show that the runtime performance of P-EFIM is significantly better than that of PHUI-Growth, which is also a HUIM algorithm based on the Hadoop framework. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.title: A Parallel High-Utility Itemset Mining Algorithm Based on Hadoop [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.description.version: publishedVersion [en_US]
dc.rights.holder: © The author(s) 2023 [en_US]
dc.source.pagenumber: 47-58 [en_US]
dc.source.volume: 3 [en_US]
dc.source.journal: Complex System Modeling and Simulation [en_US]
dc.source.issue: 1 [en_US]
dc.identifier.doi: 10.23919/CSMS.2022.0023
dc.identifier.cristin: 2138298
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1
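
The preprocessing steps named in the abstract (computing transaction-weighted utilization, ordering, pruning, and spreading work across nodes) can be illustrated with a small single-machine sketch. This is not the authors' Hadoop implementation: the function names, the toy data, and the ascending-TWU ordering are assumptions made for illustration, and `s_style_assign` is only one plausible reading of "S-style distribution" (a back-and-forth dealing order), which the record itself does not define.

```python
from collections import defaultdict


def transaction_utility(transaction, profit):
    """Utility of one transaction: sum of quantity * unit profit over its items."""
    return sum(qty * profit[item] for item, qty in transaction.items())


def twu(transactions, profit):
    """Transaction-weighted utilization of each item: the summed utility of
    every transaction that contains the item."""
    totals = defaultdict(int)
    for t in transactions:
        tu = transaction_utility(t, profit)
        for item in t:
            totals[item] += tu
    return dict(totals)


def prune_and_order(transactions, profit, min_util):
    """Drop items whose TWU is below min_util and return the surviving items
    in ascending TWU order (an assumed ordering) plus the pruned transactions."""
    scores = twu(transactions, profit)
    kept = sorted((i for i in scores if scores[i] >= min_util),
                  key=lambda i: (scores[i], i))
    keep = set(kept)
    pruned = [{i: q for i, q in t.items() if i in keep} for t in transactions]
    return kept, pruned


def s_style_assign(ordered_items, n_nodes):
    """Hypothetical S-style dealing: left to right on even passes, right to
    left on odd passes, so no node always receives the first item of a pass."""
    groups = [[] for _ in range(n_nodes)]
    for pos, item in enumerate(ordered_items):
        row, col = divmod(pos, n_nodes)
        node = col if row % 2 == 0 else n_nodes - 1 - col
        groups[node].append(item)
    return groups


# Toy data (invented for this example): item -> unit profit, and three
# transactions mapping item -> purchased quantity.
profit = {"a": 5, "b": 2, "c": 1, "d": 4}
transactions = [
    {"a": 1, "b": 2},
    {"b": 1, "c": 3},
    {"a": 2, "c": 1, "d": 1},
]

scores = twu(transactions, profit)
kept, pruned = prune_and_order(transactions, profit, min_util=15)
groups = s_style_assign(kept, n_nodes=2)
```

With a minimum utility of 15, item `b` falls below the threshold and is removed from every transaction before any subtask is built. In a MapReduce setting the TWU totals would be accumulated across mappers rather than in one loop.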


Except where otherwise noted, this item's license is described as Attribution 4.0 International