Tetyana Loskutova is a freelance analyst programmer and a business PhD candidate. She has a keen interest in philosophy and artificial intelligence and is looking forward to combining the insights from these disciplines for solving social problems. Tetyana lives in Johannesburg, South Africa. When she is not working or studying, Tetyana enjoys running, swimming, dancing and traveling.
Project: WikiClassify: Content- and Link-Based Classification
Goal of the project:
To create a classifier for Wikipedia articles that takes into account both the content of an article and its links to other articles.
Summary of work:
The solution was based on the iterative classification algorithm (ICA):
- Assign "guessed" categories to the data using a content classifier.
- Compare each article's category with those of its nearest neighbors (linked articles) using a relational classifier.
- Retrain the relational classifier based on the content classifier's categories and record the adjustments in the conditional map.
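The steps above can be illustrated with a minimal sketch. It is not the project's code: the keyword-count content classifier, the majority-vote relational step, and all names and sample data (`content_classify`, `relational_classify`, `KEYWORDS`, `ARTICLES`, `LINKS`) are assumptions made for illustration. The retraining step and the conditional map are omitted because their details are not described here.

```python
from collections import Counter

# Hypothetical keyword lists standing in for a trained content classifier.
KEYWORDS = {
    "computing": ["programming", "software", "language"],
    "biology": ["snake", "animal", "species"],
}

# Hypothetical toy articles and their outgoing links.
ARTICLES = {
    "Python": "python is a programming language",
    "Java": "java is a programming language for software",
    "Cobra": "the cobra is a snake species",
    "Anaconda": "anaconda snake",
}
LINKS = {
    "Python": ["Java", "Anaconda"],
    "Java": ["Python"],
    "Anaconda": ["Python", "Java"],
    "Cobra": [],
}


def content_classify(text, keywords):
    """Guess a category by counting keyword occurrences in the text."""
    words = text.lower().split()
    scores = {cat: sum(words.count(k) for k in kws) for cat, kws in keywords.items()}
    return max(scores, key=scores.get)


def relational_classify(name, links, labels):
    """Return the majority category among linked articles.

    An article with no labeled neighbors keeps its current category.
    """
    neighbor_labels = [labels[n] for n in links.get(name, []) if n in labels]
    if not neighbor_labels:
        return labels[name]
    return Counter(neighbor_labels).most_common(1)[0][0]


def iterative_classification(articles, links, keywords, max_iters=10):
    """Start from content-based guesses, then relabel from neighbors until stable."""
    labels = {name: content_classify(text, keywords) for name, text in articles.items()}
    for _ in range(max_iters):
        changed = False
        for name in sorted(articles):  # fixed order keeps the result reproducible
            new_label = relational_classify(name, links, labels)
            if new_label != labels[name]:
                labels[name] = new_label
                changed = True
        if not changed:
            break
    return labels


labels = iterative_classification(ARTICLES, LINKS, KEYWORDS)
```

In this toy data, "Anaconda" is first guessed as "biology" from its text alone and is then relabeled "computing" because both of its linked articles carry that category.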
Results and future work:
The full ICA was not completed. Instead, the outputs of the content and relational classifiers were used according to their estimated accuracies. Using two classifiers allowed me to account for dynamically changing content and the need to create new category labels. Future work should focus on extracting meaning from a variety of other sources.
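One possible way to use two classifiers according to their estimated accuracies is an accuracy-weighted vote. The sketch below is an assumption, not the method used in the project: the function name `combine_predictions`, the probability dictionaries, and the accuracy values are all hypothetical.

```python
def combine_predictions(content_pred, relational_pred, content_acc, relational_acc):
    """Pick the category with the highest accuracy-weighted score.

    Each *_pred maps category -> probability. A category known to only one
    classifier (for example, a newly created label) scores 0 from the other.
    """
    categories = set(content_pred) | set(relational_pred)
    scores = {
        cat: content_acc * content_pred.get(cat, 0.0)
        + relational_acc * relational_pred.get(cat, 0.0)
        for cat in categories
    }
    # Sorting first makes tie-breaking deterministic (alphabetical).
    return max(sorted(scores), key=scores.get)


# Hypothetical outputs for one article.
content_pred = {"computing": 0.6, "biology": 0.4}
relational_pred = {"computing": 0.2, "biology": 0.8}
choice = combine_predictions(content_pred, relational_pred, 0.9, 0.5)
```

With these made-up numbers the relational classifier's strong vote for "biology" outweighs the content classifier; lowering the relational accuracy estimate far enough flips the result to "computing".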
The data sources used in this project, along with code for collecting such data from Wikipedia and test data sources, are provided at "WikiClassify: Content- and Link-Based Classification."