We were a team of four students, all in the MS in Data Science program. Our goal for the GQP was to analyze the knowledge base articles to determine what is covered and where the duplication is, and to develop a strategy for improving the documentation. We were provided with the dataset; the corpus had more than 19,000 documents covering different articles. Our approach was to apply unsupervised learning to the data: compute tf-idf features, reduce their dimensionality with PCA, and cluster the documents in the low-dimensional space.
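The tf-idf, PCA, and clustering steps described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the four toy documents, the function names, and the choice of k-means are all assumptions (the clustering algorithm is not named here), and the pure-Python routines stand in for what a library such as scikit-learn would normally provide.

```python
import math
import random
from collections import Counter


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tfidf(docs):
    """Smoothed tf-idf with L2-normalised rows (whitespace tokenisation)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(dot(row, row)) or 1.0
        rows.append([x / norm for x in row])
    return rows


def pca(rows, n_components=2, iters=200):
    """Project mean-centred rows onto the leading principal directions,
    found by power iteration (a simple stand-in for a library PCA)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    components = []
    for _ in range(n_components):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            for c in components:  # stay orthogonal to directions already found
                proj = dot(v, c)
                v = [a - proj * b for a, b in zip(v, c)]
            scores = [dot(row, v) for row in X]  # X v
            v = [sum(scores[i] * X[i][j] for i in range(n)) for j in range(d)]  # X^T (X v)
            norm = math.sqrt(dot(v, v)) or 1.0
            v = [a / norm for a in v]
        components.append(v)
    return [[dot(row, c) for c in components] for row in X]


def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus; the real corpus had more than 19,000 documents.
docs = [
    "reset password account login email",
    "reset password account login phone",
    "install printer driver network setup",
    "install printer driver network cable",
]
labels = kmeans(pca(tfidf(docs), n_components=2), k=2)
print(labels)
```

On a corpus of realistic size, the dense lists used here would be replaced by sparse matrices and an optimized decomposition; the sketch only shows how the three stages feed into one another.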

For each cluster, we applied topic modeling to extract the context and identified the topics and areas that the documents in the corpus covered. This analysis adds value for the business: it shows whether there is redundancy in the documentation, where the documentation needs improvement, and how it can be made more efficient for customers using their products. The right documentation saves the business time and money in various ways.
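As a rough illustration of summarizing each cluster, the sketch below ranks the most distinctive terms per cluster. It is a lightweight stand-in for a topic model, not the method used in the project (the specific topic-modeling technique is not named here); the function name, the scoring rule, the toy documents, and the cluster labels are all assumptions.

```python
import math
from collections import Counter, defaultdict


def top_terms_per_cluster(docs, labels, n_terms=3):
    """Rank each cluster's terms by in-cluster count, weighted up when the
    term appears in few other clusters (a tf-idf-style score per cluster)."""
    cluster_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        cluster_counts[label].update(doc.lower().split())
    n_clusters = len(cluster_counts)
    # Number of clusters in which each term occurs at least once.
    cluster_df = Counter(t for counts in cluster_counts.values() for t in counts)
    summary = {}
    for label, counts in cluster_counts.items():
        scored = {
            t: c * (math.log(n_clusters / cluster_df[t]) + 1.0)
            for t, c in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scored, key=lambda t: (-scored[t], t))
        summary[label] = ranked[:n_terms]
    return summary


# Hypothetical documents and cluster assignments, for illustration only.
docs = [
    "reset password account login email",
    "reset password account login phone",
    "install printer driver network setup",
    "install printer driver network cable",
]
labels = [0, 0, 1, 1]
topics = top_terms_per_cluster(docs, labels)
print(topics)
```

Clusters whose top terms overlap heavily would be natural candidates for a closer look at duplicated articles.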

2019 Spring

Project Sponsor

KGF, Karnataka, India

Faculty Mentor
Prof. Fatemeh Emdad

The exposure to a real-world data science problem and working with on-site mentors who guided us throughout the project was a great learning experience; it was the most valuable part. The experience adds great value to my resume for my prospective career in data science and gives me a sense of how to approach data science problems from scratch. After I graduate, I want to work in the field of data science to contribute and learn more. If everything works out, I might come back for a PhD.