Text Classification

Thursday, July 20, 2017 - 18:00
London Machine Learning Study Group

Hi everyone,

I am really pleased to announce the next lecture from our Introduction to Machine Learning series.

In this session we will focus on text classification. We will talk about tokenizing, stemming words, and vectorizing documents. We will introduce metrics like term frequency (TF), inverse document frequency (IDF), and TFxIDF. We will talk about feature hashing and feature normalisation (L2-norm), and we will show how to use Multinomial Naïve Bayes to perform document classification.
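To give a flavour of these topics, here is a minimal sketch in plain Python of tokenizing, TFxIDF weighting with L2 normalisation, and a Multinomial Naïve Bayes classifier with add-one smoothing. It is not the lecture's code: the function names, the toy corpus, the regex tokenizer, and the particular TF and IDF formulas (count divided by document length, and log(N / df)) are all assumptions chosen for illustration, and stemming and feature hashing are left out for brevity.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (an assumed, very simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalised TFxIDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # TF = count / document length, IDF = log(N / df); other variants exist.
        vec = {t: (c / len(tokens)) * math.log(n_docs / df[t])
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm > 0 else vec)
    return vectors


def train_nb(docs, labels, alpha=1.0):
    """Fit Multinomial Naive Bayes on raw term counts with additive smoothing."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        term_counts[label].update(tokens)
        vocab.update(tokens)
    model = {}
    for label, n in class_docs.items():
        total = sum(term_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(n / len(labels))
        log_like = {t: math.log((term_counts[label][t] + alpha) / total)
                    for t in vocab}
        model[label] = (log_prior, log_like)
    return model


def predict_nb(model, text):
    """Return the label with the highest log-posterior; unseen terms are ignored."""
    tokens = tokenize(text)

    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)

    return max(model, key=score)


# Hypothetical toy corpus, for illustration only.
docs = ["cheap pills buy now", "buy cheap watches",
        "meeting agenda for monday", "project meeting notes"]
labels = ["spam", "spam", "ham", "ham"]

vectors = tfidf_vectors(docs)
model = train_nb(docs, labels)
```

Calling `predict_nb(model, "cheap watches now")` then classifies a new message using the word counts learned from the four training documents.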

Please note: If you have missed any of the previous lectures in the series, recordings are available at https://www.youtube.com/playlist?list=PLKryvmknjpgPLh3kS_t1_Z1DlmcYPlQhZ. Please take some time to familiarise yourself with the topics from the previous lectures, especially the Naïve Bayes session.

Just a reminder: The language of choice for the series is Python, so if you are not familiar with Python or need to brush up on your skills, I suggest you spend some time with the videos from Google's freely available Python Class.

Please note that OpenTable is our sponsor for this event, and the meetup will take place at their office in the Alphabeta Building, 14-18 Finsbury Square, London EC2A 1AH. This is a great location, just a short walk from Moorgate Underground station.

Registration from 6pm; talks begin at 6:30pm.

Alphabeta Building

14-18 Finsbury Square, EC2A 1AH