Fuzzy high-utility pattern mining in parallel and distributed Hadoop framework

Wu, Jimmy Ming-Tai; Srivastava, Gautam; Wei, Min; Yun, Unil; Lin, Jerry Chun-Wei

dc.contributor.author	Wu, Jimmy Ming-Tai
dc.contributor.author	Srivastava, Gautam
dc.contributor.author	Wei, Min
dc.contributor.author	Yun, Unil
dc.contributor.author	Lin, Jerry Chun-Wei
dc.date.accessioned	2021-10-07T13:13:26Z
dc.date.available	2021-10-07T13:13:26Z
dc.date.created	2021-01-18T14:12:20Z
dc.date.issued	2021
dc.identifier.citation	Wu, J. M.-T., Srivastava, G., Wei, M., Yun, U., & Lin, J. C.-W. (2021). Fuzzy high-utility pattern mining in parallel and distributed Hadoop framework. Information Sciences, 553, 31-48.	en_US
dc.identifier.issn	0020-0255
dc.identifier.uri	https://hdl.handle.net/11250/2788432
dc.description.abstract	Over the past decade, high-utility itemset mining (HUIM) has received widespread attention because it can surface more critical information than frequent itemset mining (FIM). However, HUIM shares a limitation with FIM: it decides whether an itemset qualifies using a binary model based on a pre-defined minimum utility threshold. In addition, most previous HUIM work has focused on single, small datasets, which is unrealistic for today's real-world big data environments. In this work, fuzzy-set theory and a MapReduce framework are combined to design a novel high fuzzy utility pattern mining algorithm that addresses both issues. First, fuzzy-set theory is applied and a new algorithm called efficient high fuzzy utility itemset mining (EFUPM) is designed to discover high fuzzy utility patterns on a single machine. Two upper bounds are then estimated to allow early pruning of unpromising candidates in the search space. To handle large-scale datasets, a Hadoop-based high fuzzy utility pattern mining (HFUPM) algorithm is then developed to discover high fuzzy utility patterns on the Hadoop framework. Experimental results show that the proposed algorithms perform strongly in mining the required high fuzzy utility patterns, both on a single machine and in a large-scale environment, compared to the current state-of-the-art approaches.	en_US
dc.language.iso	eng	en_US
dc.publisher	Elsevier	en_US
dc.relation.uri	https://doi.org/10.1016/j.ins.2020.12.004
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/deed.no	*
dc.subject	Hadoop	en_US
dc.subject	high fuzzy utility pattern	en_US
dc.subject	high utility itemset mining	en_US
dc.subject	big-data	en_US
dc.subject	fuzzy-set theory	en_US
dc.subject	MapReduce	en_US
dc.title	Fuzzy high-utility pattern mining in parallel and distributed Hadoop framework	en_US
dc.type	Peer reviewed	en_US
dc.type	Journal article	en_US
dc.description.version	publishedVersion	en_US
dc.rights.holder	© 2021 The Author(s)	en_US
dc.subject.nsi	VDP::Mathematics and natural science: 400::Information and communication science: 420	en_US
dc.source.pagenumber	31-48	en_US
dc.source.volume	553	en_US
dc.source.journal	Information Sciences	en_US
dc.identifier.doi	10.1016/j.ins.2020.12.004
dc.identifier.cristin	1873324
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	2

Associated file(s)

File name: Wu.pdf
Size: 826.3Kb
Format: PDF

This item appears in the following collection(s)

Import from CRIStin [3581]
Institutt for datateknologi, elektroteknologi og realfag [1160]

Unless otherwise stated, this item is licensed under Attribution 4.0 International