IEEE/ICACT20230212 Slide.11 Oral Presentation
This paper uses a binary tree data structure to partition the data set, with minimum information loss as the constraint condition for each partition, so as to reduce the information loss caused by k-anonymity. First, all records in the data set are regarded as a single equivalent set, and all data attributes in that set are minimized. Then, according to the distance calculation formula, the two records that are farthest apart are selected as the cluster centers of the left and right equivalent subsets. The distance from every other record to each of the two centers is then calculated, which turns the split into a 2-means classification problem. After each partition, the constraint conditions are checked: if they are met, partitioning continues; if not, partitioning stops. The process of the binary tree clustering algorithm (BTCA) is shown in the figure.
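
The recursive partitioning described above can be sketched in Python. This is a minimal illustration and not the authors' implementation. The slide does not give the distance formula or the exact constraint, so the sketch assumes Euclidean distance over numeric attributes and uses a stand-in constraint (each subset must keep at least k records). It also performs a single assignment pass to the two farthest records rather than an iterated 2-means. The names `farthest_pair`, `split`, and `btca` are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def farthest_pair(records):
    """Return the two records with the largest pairwise distance.

    Assumption: Euclidean distance stands in for the paper's formula.
    """
    return max(combinations(records, 2), key=lambda p: dist(p[0], p[1]))


def split(records):
    """Assign every record to the nearer of the two farthest records."""
    a, b = farthest_pair(records)
    left, right = [], []
    for r in records:
        (left if dist(r, a) <= dist(r, b) else right).append(r)
    return left, right


def btca(records, k):
    """Recursively bisect records and return the leaf equivalent sets.

    Assumption: the constraint is that both subsets hold at least k
    records; the paper's information-loss constraint is not reproduced.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(records) < 2 * k:
        return [records]  # too small to split into two valid subsets
    left, right = split(records)
    if len(left) < k or len(right) < k:
        return [records]  # constraint not met: stop partitioning here
    return btca(left, k) + btca(right, k)


records = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 8), (8, 9)]
groups = btca(records, k=2)
```

With these six sample records and k = 2, the first split separates the three records near (1, 1) from the three near (9, 9), and neither subset is large enough to be split again.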
