A Study of Fuzzy Association Rule Mining Based on Frequent Itemsets and Clustering Techniques

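Title:
A Study of Fuzzy Association Rule Mining Based on Frequent Itemsets and Clustering Techniques (植基於頻繁項目集合與群集技術之模糊關聯法則探勘法之研究)
Author: 郭政煌; Guo, Zheng-huang
李建億; Chien-i Lee; 數位學習科技學系
Subject: Association rule mining; Cluster analysis; Fuzzy theory; Cluster; Association Mining; Fuzzy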
Description: Computers have become indispensable tools, and records of all kinds are stored in databases: transaction data from convenience stores, department stores, and online auction sites, basic census data, and so on. Using data mining techniques to uncover hidden, meaningful information in large volumes of transaction data has long been an active topic. For quantitative transaction data, it is important to define fuzzy intervals that yield as many frequent itemsets as possible, so that more meaningful rules can be mined to support decision-making. A genetic algorithm (GA) can find precise fuzzy intervals, but its main drawbacks are the long training time and its inability to take the associations within the data into account. Partitioning the value range into equal intervals in the traditional way, on the other hand, gives no assurance that the partition suits the characteristics of the database. This thesis therefore proposes using the k-means algorithm to obtain the center points of the fuzzy partition intervals and combining them with evenly distributed fuzzy partition radii to generate fuzzy membership functions, from which frequent itemsets are then mined. In addition, to account for the associations among items, this study proposes FBCFA (frequent-based clustering for fuzzy association). FBCFA differs from CFA in that it uses a weighted k-means algorithm and runs k-means and frequent itemset mining concurrently, so that the fuzzy partition intervals and the frequent itemsets are found together. Experimental results show that FBCFA finds more frequent itemsets than CFA.
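The first idea in the abstract (k-means centers plus an evenly distributed radius, used to build membership functions for frequent itemset mining) can be illustrated roughly as follows. The abstract does not state the shape of the membership functions, how the radius is computed, or how fuzzy support is aggregated, so this is only a minimal sketch under assumptions: triangular membership functions, one shared radius per attribute, and support taken as the mean of the per-transaction minimum membership. All function names are hypothetical, and FBCFA's weighted k-means and its interleaving with itemset mining are not shown.

```python
from typing import Dict, List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], k: int, iters: int = 50) -> List[float]:
    """Plain 1-D k-means; returns the k cluster centers in ascending order."""
    data = sorted(values)
    # Deterministic start: spread the initial centers over the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups: List[List[float]] = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def triangular(x: float, center: float, radius: float) -> float:
    """Assumed membership shape: a triangle peaking at center, zero beyond radius."""
    return max(0.0, 1.0 - abs(x - center) / radius)


def fuzzy_support(transactions: Sequence[Dict[str, float]],
                  regions: Dict[str, Tuple[float, float]]) -> float:
    """Fuzzy support of an itemset given as {item: (center, radius)}.

    Assumption: a transaction contributes the minimum membership over the
    itemset's items, and an item missing from the transaction contributes 0.
    """
    total = 0.0
    for t in transactions:
        total += min(
            triangular(t[item], c, r) if item in t else 0.0
            for item, (c, r) in regions.items()
        )
    return total / len(transactions)


if __name__ == "__main__":
    # Hypothetical purchase quantities for a single item.
    quantities = [1, 1, 2, 9, 10]
    k = 2
    centers = kmeans_1d(quantities, k)
    # Assumed reading of "evenly distributed radius": value range divided by k.
    radius = (max(quantities) - min(quantities)) / k
    print(centers, radius)
```

An itemset would then count as frequent when its `fuzzy_support` meets a chosen minimum support threshold, which is also a parameter the abstract leaves unspecified.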
Master's
Date created: 2007
Format: 121 bytes
text/html
Language: Chinese
Identifier: http://nutnr.lib.nutn.edu.tw/handle/987654321/3911
Source: NUTN IR
