Papers clustered by topics

Topics

Tips

  1. Blue bars represent the 30 most distinctive terms for the selection.
  2. Browse the most distinctive terms: you will quickly notice that they are domain-specific (chemistry, biology, medicine, anthropology, and so on).
  3. Double-tap/click on a cluster to explore it.
  4. The horizontal list of documents is updated dynamically when a cluster is selected.
  5. Adjust the number of clusters (10 by default) with the drop-down menu.

What’s behind it?

The topics graph shows how the content is distributed according to its most salient terms. The algorithm compares every word in the text space to create clusters of specific words. This lets us identify the words that are uncommon yet recurrent, and use them to classify all the content.
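To make the idea of "uncommon but recurrent" terms concrete, here is a minimal sketch that scores terms with a TF-IDF-style weight. This is an assumption for illustration only: the text does not say which measure the graph actually uses, and the function name `distinctive_terms` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def distinctive_terms(selection, corpus, top_n=30):
    """Rank terms that recur in `selection` but are rare across `corpus`.

    Both arguments are lists of documents; each document is a list of tokens.
    The TF-IDF-style score is an illustrative assumption, not the tool's method.
    """
    # Document frequency: in how many corpus documents each term appears.
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    # Term frequency, counted inside the selection only.
    term_freq = Counter(term for doc in selection for term in doc)
    n_docs = len(corpus)
    scores = {
        term: count * math.log(n_docs / doc_freq[term])
        for term, count in term_freq.items()
        if doc_freq[term] > 0  # skip terms never seen in the corpus
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical toy corpus: two chemistry-like and two anthropology-like documents.
corpus = [
    ["enzyme", "reaction", "study", "enzyme"],
    ["enzyme", "catalyst", "study"],
    ["kinship", "ritual", "study"],
    ["ritual", "village", "study"],
]
chemistry = corpus[:2]
top = distinctive_terms(chemistry, corpus, top_n=3)
print(top)
```

A word such as "study", which appears in every document, gets a score of zero and drops out of the ranking, while a word concentrated in the selection rises to the top.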

Mathematically speaking, a unique vector in a multi-dimensional space is calculated for each word and then compared with the others. Words with similar vectors are grouped into clusters. To represent this multi-dimensional space in two dimensions, we use a projection technique that preserves the distribution as well as possible. The X and Y axes in this graph do not represent values but rather a scale of proximity.
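The vector comparison described above can be sketched in a few lines. The sketch below is an assumption for illustration, not the method behind the graph: it builds simple co-occurrence vectors, compares them with cosine similarity, and groups words greedily against a threshold. The helper names and the threshold value are hypothetical, and the 2D projection step is left out because the text does not name the technique used.

```python
import math

def cooccurrence_vectors(docs):
    """Map each word to a vector counting the words it shares documents with."""
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0] * len(vocab) for w in vocab}
    for doc in docs:
        unique = set(doc)
        for a in unique:
            for b in unique:
                if a != b:
                    vectors[a][index[b]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity: close to 1.0 for same direction, 0.0 for no overlap."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def group_words(vectors, threshold=0.3):
    """Greedy grouping: a word joins the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of words; the first word is its seed
    for word in sorted(vectors):
        for cluster in clusters:
            if cosine(vectors[word], vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            # No existing cluster was close enough: start a new one.
            clusters.append([word])
    return clusters

# Hypothetical toy documents from two unrelated domains.
docs = [
    ["enzyme", "reaction", "catalyst"],
    ["enzyme", "catalyst", "reaction"],
    ["kinship", "ritual", "village"],
    ["ritual", "village", "kinship"],
]
clusters = group_words(cooccurrence_vectors(docs))
print(clusters)
```

Words that appear in the same kinds of documents end up with similar vectors and land in the same cluster, while words from the other domain have a similarity of zero and form their own group.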