Decoding Academic Pathways.
Framework: Django ORM & REST
NLP Engine: Scikit-learn (NLTK)
Algorithm: TF-IDF + Cosine Similarity
The Data Harvest
METU Portal: unstructured web data
Parsing Engine: regex & BS4 pipeline
Django DB: structured course entities
Automated parsing of 5000+ course descriptions, credits, and prerequisites from METU's official curriculum portals.
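The regex stage of a pipeline like this can be sketched as follows. This is a minimal illustration, not the project's parser: it assumes the HTML has already been reduced to plain text (for example with BeautifulSoup's `get_text()`), and the catalog line format, the `Course` dataclass, and `parse_course` are all hypothetical. The sample data is invented.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Course:
    """Hypothetical structured entity; a Django model could carry the same fields."""
    code: str
    title: str
    credits: int
    prerequisites: List[str] = field(default_factory=list)
    description: str = ""


# Assumed header format: "ABC 101 Introduction to Widgets (3 credits)"
HEADER_RE = re.compile(
    r"^(?P<code>[A-Z]{2,4}\s?\d{3})\s+(?P<title>.+?)\s+\((?P<credits>\d+)\s+credits?\)$"
)
# Assumed prerequisite line: "Prerequisite: ABC 100" or "Prerequisites: ABC 100, XYZ 205"
PREREQ_RE = re.compile(r"Prerequisites?:\s*(?P<codes>.+)", re.IGNORECASE)
CODE_RE = re.compile(r"[A-Z]{2,4}\s?\d{3}")


def parse_course(block: str) -> Optional[Course]:
    """Turn one plain-text catalog block into a Course, or None if the header does not match."""
    lines = [ln.strip() for ln in block.strip().splitlines() if ln.strip()]
    if not lines:
        return None
    header = HEADER_RE.match(lines[0])
    if header is None:
        return None

    prerequisites: List[str] = []
    description_parts: List[str] = []
    for line in lines[1:]:
        prereq = PREREQ_RE.match(line)
        if prereq:
            # Pull every course code out of the prerequisite line
            prerequisites.extend(CODE_RE.findall(prereq.group("codes")))
        else:
            description_parts.append(line)

    return Course(
        code=header.group("code"),
        title=header.group("title"),
        credits=int(header.group("credits")),
        prerequisites=prerequisites,
        description=" ".join(description_parts),
    )


# Invented sample block, not real catalog data
sample = """
ABC 101 Introduction to Widgets (3 credits)
Covers the basic principles of widget construction.
Prerequisites: ABC 100, XYZ 205
"""
course = parse_course(sample)
```

Returning `None` for an unrecognized header lets the caller log and skip malformed entries instead of storing partial records.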
The Semantic Core
STEP 01
Vector Space Transformation
Academic interests are rarely limited to exact keywords. To handle this, we transform course descriptions into a high-dimensional vector space: every distinct term becomes an axis, and every course becomes a vector whose coordinates record how strongly each term characterizes it.
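To make the idea of "terms as axes" concrete, here is a small sketch that builds a vocabulary and maps each description to a raw term-count vector. It is written from scratch with the standard library for illustration; the helper names (`tokenize`, `build_vocabulary`, `to_count_vector`) are hypothetical, and a scikit-learn pipeline would typically use a vectorizer class and sparse matrices instead.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents: List[str]) -> Dict[str, int]:
    """Assign each distinct term one axis (column index), in alphabetical order."""
    terms = sorted({token for doc in documents for token in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}


def to_count_vector(text: str, vocabulary: Dict[str, int]) -> List[int]:
    """Map a text to a vector holding how often each vocabulary term occurs."""
    vector = [0] * len(vocabulary)
    for token in tokenize(text):
        if token in vocabulary:  # terms outside the vocabulary are ignored
            vector[vocabulary[token]] += 1
    return vector


# Toy descriptions, shortened for readability
descriptions = [
    "user centered design and prototyping",
    "spatial data analysis and mapping",
]
vocabulary = build_vocabulary(descriptions)
vectors = [to_count_vector(d, vocabulary) for d in descriptions]
```

With two short descriptions the space already has nine dimensions. A real catalog has one axis per distinct term, which is why the vectors are high-dimensional and mostly zeros.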
STEP 02
Cosine Similarity Matching
When a student enters a query, we map it into the same vector space and compute the cosine of the angle between the query vector and every course vector in our database. A smaller angle (a cosine closer to 1) indicates higher semantic relevance.
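The matching step can be sketched in a few lines of plain Python. The vectors below are made-up toy values in three dimensions, and `rank_courses` is a hypothetical helper; the guard for zero-length vectors is an assumption about how empty queries should be handled.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between a and b, or 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_courses(
    query_vector: Sequence[float],
    course_vectors: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every course against the query and sort from most to least similar."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in course_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors with invented values
course_vectors = {
    "Course A": [1.0, 1.0, 0.0],
    "Course B": [0.0, 1.0, 1.0],
    "Course C": [0.0, 0.0, 2.0],
}
query_vector = [1.0, 0.5, 0.0]
ranking = rank_courses(query_vector, course_vectors)
```

Because the score depends only on direction, a long description and a short one that use terms in the same proportions score identically against a query.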
Result Accuracy
Introduction to HCI
Core principles of human-computer interaction, focused on user-centered design and iterative prototyping.
Urban Planning Studio
Designing smart city environments using participatory mapping and spatial data analysis tools.
Digital Design Research
Exploring the intersection of human behavior and digital environments through qualitative research.
From Keyword to Meaning
Traditional search engines rely on exact matches: if you search for "Human Interaction," you might miss a course titled "User Experience."
By implementing TF-IDF (Term Frequency-Inverse Document Frequency), we give more weight to terms that are distinctive to a few courses and dampen words that appear in almost every description. This helps turn a student's vague interest into precise, data-driven recommendations.
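The dampening effect is easiest to see with the textbook weighting, tf × ln(N / df), sketched below on three invented token lists. This is an illustrative from-scratch version, not the project's code. scikit-learn's `TfidfVectorizer` uses a smoothed IDF and L2 normalization by default, so there a term found in every document is down-weighted rather than zeroed.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by tf * idf, with tf = count / length and idf = ln(N / df)."""
    n_docs = len(documents)
    # Document frequency: the number of documents in which each term appears
    df = Counter(term for doc in documents for term in set(doc))
    weighted = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weighted.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weighted


# Invented, pre-tokenized descriptions; "course" appears in all of them
docs = [
    ["course", "human", "computer", "interaction"],
    ["course", "urban", "planning", "studio"],
    ["course", "digital", "design", "research"],
]
weights = tf_idf(docs)
```

Here "course" appears in all three documents, so its IDF is ln(3/3) = 0 and it contributes nothing to any vector, while a term such as "human" keeps a weight of 0.25 × ln 3.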
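# Cosine similarity: dot product divided by the product of the two vector lengths
relevance = dot_product(v_query, v_course) / (norm(v_query) * norm(v_course))
# Rank courses by that score, most relevant first
results = sorted(courses, key=lambda v_course: cosine_similarity(v_query, v_course), reverse=True)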