This poster presents a pilot project that uses classification, a machine learning (ML) technique, to categorize the large volume of transcripts from the library’s virtual reference service. Two classification models, a random forest model and a gradient boosting model, were explored to classify chat questions into two categories: reference questions and non-reference questions. The purpose of the project is to explore whether a classification model can automatically categorize new questions as they arrive, so that the library can make better use of its resources and serve its users more effectively and efficiently. The general steps of an ML workflow and snippets of Python code will also be presented.
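As a rough illustration of the kind of Python snippet involved, the sketch below trains a random forest and a gradient boosting classifier on a handful of chat questions and labels a new one. It is a minimal sketch, not the project's actual code: the use of scikit-learn, the TF-IDF text features, the function names `build_models` and `classify`, and the example questions and labels are all assumptions made for illustration.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical, hand-written training examples (not data from the project).
QUESTIONS = [
    "Where can I find peer-reviewed articles on climate change?",
    "How do I cite a journal article in APA style?",
    "Can you help me find sources for my history paper?",
    "Which database should I use for nursing research?",
    "What time does the library close today?",
    "How do I renew my library card?",
    "Is the printer on the second floor working?",
    "Can I book a study room for tomorrow?",
]
LABELS = ["reference"] * 4 + ["non-reference"] * 4


def build_models():
    """Return the two candidate models, each as a text-to-label pipeline.

    TF-IDF is an assumed feature representation; the abstract does not
    say how the transcripts were vectorized.
    """
    return {
        "random_forest": make_pipeline(
            TfidfVectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=0),
        ),
        "gradient_boosting": make_pipeline(
            TfidfVectorizer(),
            GradientBoostingClassifier(random_state=0),
        ),
    }


def classify(new_questions):
    """Fit each model on the toy data and predict labels for new questions."""
    results = {}
    for name, model in build_models().items():
        model.fit(QUESTIONS, LABELS)
        results[name] = list(model.predict(new_questions))
    return results


if __name__ == "__main__":
    predictions = classify(["How do I find articles about nursing?"])
    for name, predicted in predictions.items():
        print(name, predicted)
```

With only eight training examples the predictions are not meaningful; a real run would fit the models on a labeled sample of transcripts and evaluate them on a held-out set before using them on new questions.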