International Journal of Computer Science and Informatics

Abstract

Text documents on the web are organized in hierarchies, and the volume of content grows over the years. Classifying these documents requires class labels, but a document in the corpus often belongs to more than one class or category. Most such corpora are large, for example Wikipedia and the Yahoo ODP directory. Classifying these large-scale datasets therefore requires multi-label classification. As more documents are added to the hierarchy, a very high imbalance arises between classes at the different levels of the hierarchy. This makes it difficult to assign documents to their actual classes, so a relevance measure is used to calculate the relevance of a text document to a class label and thereby maintain a stable hierarchy. A further issue is that as the number of unique labels increases, classification becomes unstable and slower; limiting the number of unique labels therefore improves classification performance.
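
The two ideas named in the abstract, scoring the relevance of a document to a class label and limiting the number of unique labels, can be illustrated with a minimal sketch. The abstract does not specify the relevance measure or the limiting rule, so everything here is an assumption made for illustration: cosine similarity over term counts stands in for the relevance measure, a simple keep-the-most-frequent rule stands in for label limiting, and the function names `relevance` and `limit_labels` are hypothetical.

```python
import math
from collections import Counter


def relevance(doc_tokens, class_tokens):
    """Cosine similarity between term-count vectors.

    Assumption: this stands in for the paper's relevance measure,
    which the abstract does not define.
    """
    d, c = Counter(doc_tokens), Counter(class_tokens)
    dot = sum(d[t] * c[t] for t in d)
    norm = math.sqrt(sum(v * v for v in d.values())) * math.sqrt(
        sum(v * v for v in c.values())
    )
    return dot / norm if norm else 0.0


def limit_labels(doc_labels, max_labels):
    """Keep only the max_labels most frequent labels across the corpus.

    Assumption: frequency-based pruning is one simple way to cap the
    number of unique labels; the paper may use a different rule.
    """
    counts = Counter(label for labels in doc_labels for label in labels)
    # Rank by descending frequency, then by name so ties are deterministic.
    ranked = sorted(counts, key=lambda label: (-counts[label], label))
    kept = set(ranked[:max_labels])
    return [[label for label in labels if label in kept] for labels in doc_labels]
```

For instance, `limit_labels` applied to a small multi-label corpus with a cap of two labels drops any label outside the two most frequent ones, and `relevance` returns a score between 0 and 1 that could be thresholded before assigning a document to a class.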
