CSE 626 Information Retrieval Systems (3 credits)

Catalog description:

Introduction to information storage and retrieval (IR). Indexing, clustering, and signature generation. Retrieval approaches: inverted files, cluster-based retrieval, signature files, hypertext, and multimedia systems. Special hardware for IR. Web-based IR and information filtering.

Prerequisite:

CSE 274 / CSE 606 or equivalent.

Required topics (approximate weeks allocated):

  • Information Systems (1)
    • Overview of Information Systems
    • Information Retrieval Systems (IRSs)
    • Differences and Similarities between DBMSs and IRSs
  • Automatic Content Analysis and Indexing (2)
    • Feature Selection
    • Indexing Aims (Precision and Recall)
    • Indexing Aids
    • Term Weighting
    • Stemming
    • Spelling Correction
    • Indexing for Image Databases
  • Document Clustering (2)
    • Clustering Algorithms
    • Algorithm Implementation
  • Document Retrieval Methods (3)
    • Query-Document Matching Functions
    • Cluster-Based Retrieval
    • Inverted File-Based Retrieval
    • Multi-attribute Retrieval Methods
    • PAT Trees and Arrays
    • Signature Files
    • Hypertext Systems
    • Multimedia Systems
    • Pros and Cons of Different Approaches
  • System Performance (2)
    • Effectiveness Considerations
    • Efficiency Considerations
    • Case Studies
  • Web-based Information Retrieval (1.5)
    • Information Filtering (Selective Dissemination of Information)
    • Resource Discovery on the Web
    • Automatic User Profile Update
  • Hardware Approaches to Information Retrieval (1.5)
    • Hardware Aids to Text Searching
    • Pattern Matching Using Distributed Array Processors
    • Clustering Using Distributed Array Processors
    • Connection Machine
    • Hardware for Signature File Searching
  • Other Topics Related to Information Retrieval (1)
    • Text Compression
    • Text Encryption
  • Exams/Review (1)