Project - Artificial Intelligence - Third Year of Secondary School
Part 1
1. Basics of Artificial Intelligence
2. Artificial Intelligence Algorithms
3. Natural Language Processing (NLP)
Part 2
4. Image Recognition
5. Optimization & Decision-making Algorithms
Project

Text classification is a 2-step process:
Step 1: Use a set of training documents with known labels (classes) to train a classification model.
Step 2: Use the trained model to predict the label of each document in a testing set. The labels in the testing set are either unknown or hidden and used later for verification.

The documents in both the training and testing sets have to be vectorized before they can be used. The CountVectorizer or TfidfVectorizer tools from the sklearn library can be used for vectorization.

The Python sklearn library offers a long list of classification models. Some of them are:
> GradientBoostingClassifier()
> DecisionTreeClassifier()
> RandomForestClassifier()

Your task is to use the IMDB training set that was used in this lesson to train a model that achieves the highest possible accuracy on the IMDB testing set (imdb_data/imdb_test.csv). You can achieve this by:
1. Replacing the MultinomialNB classifier with other classification models from sklearn, such as the ones listed above.
2. Re-running your notebook after each replacement, to compute the accuracy of each new model that you try.
3. Creating a report that compares the accuracy of all the models that you tried and identifies the one that achieved the best accuracy.

Ministry of Education 2024-1446
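The three steps of the task can be sketched roughly as follows. This is only an illustration, not the lesson's notebook: the helper name compare_classifiers and the tiny toy reviews are assumptions made so the example runs on its own, and the classifiers are used with default settings apart from a fixed random_state.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(train_texts, train_labels, test_texts, test_labels):
    """Hypothetical helper: train several models on the same vectors
    and return a dictionary {model name: accuracy on the testing set}."""
    # Fit the vectorizer on the training documents only,
    # then reuse the same vocabulary for the testing documents.
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(train_texts)
    X_test = vectorizer.transform(test_texts)

    # The baseline from the lesson plus the three models listed in the project.
    models = {
        "MultinomialNB": MultinomialNB(),
        "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
        "RandomForestClassifier": RandomForestClassifier(random_state=0),
        "GradientBoostingClassifier": GradientBoostingClassifier(random_state=0),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, train_labels)          # Step 1: train
        predictions = model.predict(X_test)       # Step 2: predict
        results[name] = accuracy_score(test_labels, predictions)
    return results


# Toy data (an assumption, standing in for the IMDB reviews).
train_texts = [
    "a wonderful and moving film",
    "great acting and a great story",
    "I loved every minute of it",
    "a boring and terrible film",
    "awful acting and a weak story",
    "I hated every minute of it",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
test_texts = ["a great and wonderful story", "terrible and boring acting"]
test_labels = ["pos", "neg"]

results = compare_classifiers(train_texts, train_labels, test_texts, test_labels)

# A minimal report: models sorted from highest to lowest accuracy.
for name, accuracy in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {accuracy:.3f}")

best_name = max(results, key=results.get)
print("Best model:", best_name)
```

To run the same comparison on the IMDB data, the review texts and labels from the lesson's training and testing files would be passed in place of the toy lists. The column names in those files are not assumed here, so check them in your notebook first.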
Wrap up

Now you have learned to:
> Classify text with supervised learning models.
> Analyze text with unsupervised learning models.
> Use Machine Learning models for NLG.
> Program a simple chatbot.

KEY TERMS
Black-Box Predictors
Chatbot
Cluster
Dendrogram
Dimensionality Reduction
Document Clustering
Natural Language Generation
Natural Language Processing
Part of Speech (POS) Tags
Sentiment Analysis
Supervised Learning
Syntax Analysis
Tokenization
Transfer Learning
Unsupervised Learning
Vectorization