Course Information:

  • Course: IDS 566 - Advanced Text Analytics for Business
  • Term: Spring 2022
  • Time: Wed 3:00pm - 5:30pm
  • Location: Lecture Center D005
  • Contact: yuhenghu at uic dot edu

Overview

Given the vast amount of textual information available today, it is increasingly critical to mine high-quality information from text. Patterns mined from text are important for many applications, including business intelligence, information acquisition, behavior analysis, and decision making. In this course, we will cover several important topics in text mining, including basic natural language processing techniques, document representation, text categorization and clustering, information extraction, and sentiment analysis. We will also provide opportunities to gain hands-on experience handling large-scale data sets.

Academic Integrity

You are expected to adhere to the highest standards of academic honesty. Unless otherwise specified, collaboration on assignments is not allowed. Collaboration among team members is allowed only on the final term projects that are selected. Use of published materials is allowed, but the sources must be explicitly cited in your solutions. Violations will be reviewed and sanctioned according to the University Policy on Academic Integrity. "Academic integrity is the pursuit of scholarly activity free from fraud and deception and is an educational objective of this institution. Academic dishonesty includes, but is not limited to, cheating, plagiarizing, fabricating of information or citations, facilitating acts of academic dishonesty by others, having unauthorized possession of examinations, submitting work for another person or work previously used without informing the instructor, or tampering with the academic work of other students." For more information about violations of academic integrity and their consequences, consult http://vcsa.uic.edu/

Prerequisites

A programming background (IDS 400 or 401) and experience with algorithms and data structures, probability, statistics, matrices, and a database query language such as SQL are required. Experience in data mining (IDS 572) is highly recommended.

Recommended Textbooks

Weekly Schedule

Week 1 Introduction and how to get and represent text
Week 2 Classification models
Week 3 Clustering and topic models
Week 4 Word embeddings and language models
Week 5 Sentiment analysis I
Week 6 Sentiment analysis II
Week 7 Text analytics for business: case studies
Week 8 Final presentation