### Definition

In premDAT, the similarity is a measure of how close two characters are. It ranges from 1 to 100 and currently relies on two attributes of the data:

1. The dimensions
2. The tags

Each pair of characters is compared dimension by dimension. Similar scores (or levels) in a dimension bring more points, but the importance of each dimension varies. When a dimension reaches 1 or 5 for at least one character, it is considered important: a defining attribute of that character. If both scores match on a defining attribute, the similarity level will likely be high. Conversely, large differences on defining attributes will severely decrease similarity. Matching tags also add a few points to the final computation, while tags present for one character but not the other slightly diminish the final similarity score.
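As a rough illustration of the weighting rule described above, the following sketch marks a dimension as defining when either character scores 1 or 5. The function names are hypothetical, and the weights 10 and 5 are the ones used in the concrete example in this document, not necessarily premDAT's actual constants.

```javascript
// Hypothetical helpers; names are illustrative, not premDAT's real API.

// A dimension is "defining" for a pair when at least one character
// sits at either extreme of the 1-5 scale.
function isDefining(scoreA, scoreB) {
  return [scoreA, scoreB].some((s) => s === 1 || s === 5);
}

// Assumed weights: 10 for a defining dimension, 5 otherwise
// (values taken from the worked example in this document).
function importance(scoreA, scoreB) {
  return isDefining(scoreA, scoreB) ? 10 : 5;
}

console.log(importance(4, 1)); // 10
console.log(importance(2, 2)); // 5
```

Because the check looks at both characters, a dimension counts as important for the pair even when only one of them is at an extreme.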

### Concrete example (and values)

Consider the following abstract example with two characters (A and B), three tags (I, II, III) and five dimensions.

A has the following dimensions and tags:

• Dimension 1 : 4
• Dimension 2 : 2
• Dimension 3 : 3
• Dimension 4 : 5
• Dimension 5 : 1
• Tags: I

B has the following dimensions and tags:

• Dimension 1 : 1
• Dimension 2 : 2
• Dimension 3 : 5
• Dimension 4 : 4
• Dimension 5 : 3
• Tags: I, III

Here is how the similarity is computed:

• Dimension 1: the importance of the dimension is 10, because B has a score of 1. The similarity for the dimension is `4 - Math.abs(Dimension 1 for A - Dimension 1 for B)`, which results in `4 - 3 = 1`. The dimension's score is the importance multiplied by the similarity, so 10 out of a maximum of `10 * 4 = 40` (`10/40`).
• Dimension 2: importance of 5, similarity of 4. Score is `20/20`.
• Dimension 3: importance of 10, similarity of 2. Score is `20/40`.
• Dimension 4: importance of 10, similarity of 3. Score is `30/40`.
• Dimension 5: importance of 10, similarity of 2. Score is `20/40`.
• Tags: each tag carried by at least one of the two characters adds 3 points to the total, and each tag they share also adds 3 points to the similarity score. A and B both have tag I, but only B has tag III. Thus, the total is increased by 6 points while the similarity score is increased by 3 (`3/6`).

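The final result is computed by `similarity / total * 100`. Adding up the lines above gives a similarity of `10 + 20 + 20 + 30 + 20 + 3 = 103` out of a total of `40 + 20 + 40 + 40 + 40 + 6 = 186`, so the similarity between A and B is `103 / 186 * 100 ≈ 55%`.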
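The following is a minimal sketch that recomputes this example by summing the per-line scores listed above and applying `similarity / total * 100`. The function name `computeSimilarity`, the object shape (`dimensions`, `tags`), and the constant names are assumptions made for illustration; the weights (10 or 5 per dimension, 3 per tag) come from the example itself.

```javascript
// Sketch only: function, field, and constant names are hypothetical.
const MAX_DIFF = 4;   // largest possible gap between two scores on a 1-5 scale
const TAG_POINTS = 3; // points per tag, as in the example

function computeSimilarity(a, b) {
  let score = 0;
  let total = 0;

  a.dimensions.forEach((scoreA, i) => {
    const scoreB = b.dimensions[i];
    // A dimension is defining when either character scores 1 or 5.
    const defining = [scoreA, scoreB].some((s) => s === 1 || s === 5);
    const importance = defining ? 10 : 5;
    const closeness = MAX_DIFF - Math.abs(scoreA - scoreB);
    score += importance * closeness;
    total += importance * MAX_DIFF;
  });

  // Every tag carried by either character counts toward the total;
  // only tags carried by both count toward the score.
  const tagsA = new Set(a.tags);
  const tagsB = new Set(b.tags);
  for (const tag of new Set([...tagsA, ...tagsB])) {
    total += TAG_POINTS;
    if (tagsA.has(tag) && tagsB.has(tag)) score += TAG_POINTS;
  }

  return { score, total, percent: (score / total) * 100 };
}

const A = { dimensions: [4, 2, 3, 5, 1], tags: ["I"] };
const B = { dimensions: [1, 2, 5, 4, 3], tags: ["I", "III"] };
const result = computeSimilarity(A, B);
console.log(result.score, result.total); // 103 186
console.log(Math.round(result.percent)); // 55
```

Returning the raw `score` and `total` alongside the percentage makes it easy to check each contribution against the list above.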