Course Information:
- Course: IDS 566 - Advanced Text Analytics for Business
- Term: Spring 2022
- Time: Wed 3:00pm - 5:30pm
- Location: Lecture Center D005
- Contact: yuhenghu at uic dot edu
Overview
Given the vast amount of textual information available today, mining high-quality information from text has become increasingly critical. Patterns mined from text are important for many applications, including business intelligence, information acquisition, behavior analysis, and decision making. In this course, we will cover several important topics in text mining, including basic natural language processing techniques, document representation, text categorization and clustering, information extraction, and sentiment analysis. We will also provide opportunities to gain hands-on experience handling large-scale data sets.
Academic Integrity
You are expected to adhere to the highest standards of academic honesty. Unless otherwise specified, collaboration on assignments is not allowed; collaboration among team members is allowed only for the selected final term projects. Use of published materials is allowed, but the sources must be explicitly cited in your solutions. Violations will be reviewed and sanctioned according to the University Policy on Academic Integrity. "Academic integrity is the pursuit of scholarly activity free from fraud and deception and is an educational objective of this institution. Academic dishonesty includes, but is not limited to, cheating, plagiarizing, fabricating of information or citations, facilitating acts of academic dishonesty by others, having unauthorized possession of examinations, submitting work for another person or work previously used without informing the instructor, or tampering with the academic work of other students." For more information about violations of academic integrity and their consequences, consult http://vcsa.uic.edu/
Prerequisites
A programming background (IDS 400 or 401) is required, along with experience in algorithms and data structures, probability, statistics, matrices, and a database query language such as SQL. Experience in data mining (IDS 572) is highly recommended.
Recommended textbooks
- Mining Text Data. Charu C. Aggarwal and ChengXiang Zhai, Springer, 2012.
- Introduction to Information Retrieval.
- Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit
Weekly Schedule
Week 1 | Introduction and how to get and represent text |
Week 2 | Classification models |
Week 3 | Clustering and topical models |
Week 4 | Word embeddings and language models |
Week 5 | Sentiment analysis I |
Week 6 | Sentiment analysis II |
Week 7 | Text analytics for business: case studies |
Week 8 | Final presentation |