Sampling-based visual assessment computing techniques for an efficient social data clustering

Bibliographic Details
Published in:	The Journal of supercomputing Vol. 77; no. 8; pp. 8013 - 8037
Main Authors:	Basha, M. Suleman, Mouleeswaran, S. K., Prasad, K. Rajendra
Format:	Journal Article
Language:	English
Published:	New York: Springer US, 01-08-2021; Springer Nature B.V.
Subjects:	Clustering; Compilers; Computer Science; Computing time; Interpreters; Memory management; Mobile and Intelligent Sensing on High Performance Computing; Processor Architectures; Programming Languages; Sampling; Feature extraction; Cluster tendency; Social data clustering; Scalability; Visual methods

Description
Summary:	Visual methods are used for pre-cluster assessment and for deriving useful cluster partitions. Existing visual methods, such as visual assessment of tendency (VAT), spectral VAT (SpecVAT), cosine-based VAT (cVAT), and multi-viewpoints cosine-based similarity VAT (MVS-VAT), effectively assess the number of clusters, or cluster tendency, in a dataset. Partitioning tweet data is the underlying problem in social data clustering. Cosine-based visual methods have been widely successful in text data clustering, so cVAT and MVS-VAT are the best-suited methods for deriving social data clusters. However, MVS-VAT suffers from scalability problems in both computational time and memory allocation. This paper therefore presents a sampling-based MVS-VAT computing technique that selects sample inter-cluster viewpoints to overcome the scalability problem in social data clustering. In the experiments, standard health keywords and the benchmark TREC2017 and TREC2018 health keywords are used to extract health tweets, and the performance of the existing and proposed visual methods is compared on them.
ISSN:	0920-8542 1573-0484
DOI:	10.1007/s11227-021-03618-6
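
Illustrative note: the summary refers to cosine-based VAT without showing what a VAT reordering does. The following is a minimal sketch of the generic VAT ordering step applied to a cosine dissimilarity matrix. It is not the paper's cVAT, MVS-VAT, or sampling-based technique. The function names (cosine_dissimilarity, vat_order, reorder) and the toy term-frequency vectors are assumptions made for illustration only and do not come from the article.

```python
import math


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity of two term-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Treat an all-zero vector as maximally dissimilar to everything.
    return 1.0 - dot / norm if norm else 1.0


def vat_order(dissim):
    """Return a VAT-style ordering of objects (Prim-like nearest-neighbor chain)."""
    n = len(dissim)
    # Start from an object that takes part in the largest dissimilarity.
    start = max(range(n), key=lambda i: max(dissim[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    # nearest[k]: smallest dissimilarity from k to any already ordered object.
    nearest = {k: dissim[start][k] for k in remaining}
    while remaining:
        # Append the unordered object closest to the ordered set
        # (ties broken by the lower index).
        j = min(remaining, key=lambda k: (nearest[k], k))
        order.append(j)
        remaining.remove(j)
        for k in remaining:
            nearest[k] = min(nearest[k], dissim[j][k])
    return order


def reorder(dissim, order):
    """Permute rows and columns of the dissimilarity matrix by the ordering."""
    return [[dissim[i][j] for j in order] for i in order]


# Hypothetical toy data: four "tweets" as term-frequency vectors over four terms.
# Tweets 0 and 2 share terms, as do tweets 1 and 3.
tweets = [
    [3, 0, 1, 0],
    [0, 2, 0, 3],
    [2, 0, 2, 0],
    [0, 3, 0, 2],
]
dissim = [[cosine_dissimilarity(u, v) for v in tweets] for u in tweets]
order = vat_order(dissim)
reordered = reorder(dissim, order)

for row in reordered:
    print(" ".join(f"{value:.2f}" for value in row))
```

When the reordered matrix is rendered as a grayscale image, groups of similar objects appear as dark square blocks along the diagonal, and the number of blocks suggests the number of clusters. The sketch stores the full n-by-n matrix, so its memory use grows quadratically with the number of objects, which is the kind of scalability cost the summary mentions.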