Notebook Three | Report | Repository

Tags vectorisation

Andrea Leone
University of Trento
January 2022

Analyse the tag distribution in the database: extract all tags of each talk and store them in a set
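A minimal sketch of this step, assuming each talk record stores its tags as a list under a hypothetical `tags` key. The sample records are invented for illustration, and the real database schema may differ.

```python
# Hypothetical talk records; the real schema and tag values may differ.
talks = [
    {"title": "Talk A", "tags": ["science", "technology"]},
    {"title": "Talk B", "tags": ["culture", "science"]},
    {"title": "Talk C", "tags": ["technology", "design"]},
]

# Collect every distinct tag across all talks.
all_tags = set()
for talk in talks:
    all_tags.update(talk["tags"])

print(sorted(all_tags))
```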

Get the frequency of each tag in the set
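One way to count tag occurrences is with `collections.Counter`. The sketch below uses made-up tag lists, one per talk, and is not necessarily how the notebook computes the frequencies.

```python
from collections import Counter

# Made-up tag lists, one per talk, standing in for the database content.
talk_tags = [
    ["science", "technology"],
    ["culture", "science"],
    ["technology", "design", "science"],
]

# Count how many talks mention each tag.
tag_counts = Counter(tag for tags in talk_tags for tag in tags)

# List tags from most to least frequent.
for tag, count in tag_counts.most_common():
    print(tag, count)
```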

Plot the tag frequency distribution

Tags are numerous and varied: select a handful and check how often each occurs

Create a dictionary with three main categories, each collecting the tags that describe or relate to it.

Assign each record to one of the three categories
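The category dictionary and the assignment step could look like the sketch below. The category names, their tags, and the rule used here (pick the category sharing the most tags with the record, with ties going to the category defined first) are all assumptions for illustration; the notebook's actual choices are not shown in the text.

```python
# Hypothetical categories and tags; the notebook's actual choices may differ.
categories = {
    "science": {"science", "physics", "biology"},
    "technology": {"technology", "computers", "ai"},
    "society": {"culture", "politics", "education"},
}

def assign_category(tags, categories):
    """Return the category sharing the most tags with the record, or None."""
    tag_set = set(tags)
    overlaps = {name: len(members & tag_set) for name, members in categories.items()}
    # max() keeps the first maximal key, so ties go to the category defined first.
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None

print(assign_category(["physics", "ai", "biology"], categories))
```

Records whose tags match no category return `None` under this rule, so they can be filtered out or inspected separately.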

Explore the data distribution according to the three categories.
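A quick way to inspect the balance is to count records per category and express each count as a share of the total. The labels below are invented placeholders, not the notebook's data.

```python
from collections import Counter

# Placeholder category labels, one per record.
labels = ["science", "technology", "society", "science", "society", "technology"]

# Count records per category and convert the counts to fractions of the total.
distribution = Counter(labels)
total = sum(distribution.values())
shares = {name: count / total for name, count in distribution.items()}

for name, share in shares.items():
    print(f"{name}: {distribution[name]} records ({share:.1%})")
```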

The data appear fairly balanced across the three categories. Now load the vectors and compress them.
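The text does not say how the vectors are compressed. As one possible reading, the sketch below assumes that compression means reducing each vector to three components with PCA, computed through NumPy's SVD. Both the technique and the random stand-in vectors are assumptions, and the notebook may use a different method.

```python
import numpy as np

def pca_compress(vectors, n_components=3):
    """Project row vectors onto their top principal components."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Random stand-in for the talk vectors: 20 records with 50 dimensions each.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(20, 50))

compressed = pca_compress(vectors)
print(compressed.shape)
```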

Arrange the data for a 3D scatter plot and inspect the result
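Arranging the data typically means grouping the three-component points by category so that each category can be drawn as its own series. The sketch below uses invented points and labels and assumes one label per point.

```python
from collections import defaultdict

# Invented 3D points and their category labels, one label per point.
points = [(0.1, 0.2, 0.3), (1.0, 1.1, 0.9), (0.2, 0.1, 0.4), (2.0, 2.1, 1.9)]
labels = ["science", "technology", "science", "society"]

# Group coordinates by category: one x/y/z series per label.
traces = defaultdict(lambda: {"x": [], "y": [], "z": []})
for (x, y, z), label in zip(points, labels):
    traces[label]["x"].append(x)
    traces[label]["y"].append(y)
    traces[label]["z"].append(z)

for name, series in traces.items():
    print(name, len(series["x"]), "points")
```

Each series can then be handed to a 3D plotting call, for example matplotlib's `ax.scatter(xs, ys, zs, label=name)` on an axes created with `projection="3d"`.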