Quickstart

A two-set fit

The simplest case: two sets with one overlap.

import eunoia as eu

fit = eu.euler({"A": 10, "B": 5, "A&B": 3})
print(fit)
EulerFit (2 circles, diag_error=3.777e-13, stress=4.838e-25, loss=1.124e-24)
         original      fitted    residual regionError
  A            10          10  -4.597e-12   3.777e-13
  B             5           5  -6.758e-12    5.89e-14
  A&B           3           3  -9.156e-12   3.187e-13
fit.plot();
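For two circles the exact layout can also be worked out by hand, which makes a useful sanity check on the near-zero residuals above. The sketch below is independent of Eunoia and is not how the library fits diagrams; `lens_area` and `solve_distance` are hypothetical helper names. It turns the exclusive areas into full circle areas (10 + 3 and 5 + 3), derives the radii, and bisects on the center distance until the lens-shaped intersection has area 3.

```python
import math


def lens_area(r1, r2, d):
    """Area of the intersection of two circles with radii r1, r2 and center distance d."""
    if d >= r1 + r2:
        return 0.0  # circles do not touch
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2  # smaller circle lies inside the larger
    # Two circular sectors minus the kite formed by the centers and crossing points.
    a1 = r1**2 * math.acos((d**2 + r1**2 - r2**2) / (2 * d * r1))
    a2 = r2**2 * math.acos((d**2 + r2**2 - r1**2) / (2 * d * r2))
    kite = 0.5 * math.sqrt(
        (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2)
    )
    return a1 + a2 - kite


def solve_distance(area_a, area_b, overlap, tol=1e-12):
    """Return (r_a, r_b, d) so that circles of the given areas overlap by `overlap`."""
    r_a = math.sqrt(area_a / math.pi)
    r_b = math.sqrt(area_b / math.pi)
    lo, hi = abs(r_a - r_b), r_a + r_b  # overlap shrinks as d grows from lo to hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lens_area(r_a, r_b, mid) > overlap:
            lo = mid  # too much overlap: move the centers apart
        else:
            hi = mid
    return r_a, r_b, 0.5 * (lo + hi)


# Exclusive input {"A": 10, "B": 5, "A&B": 3} means circle areas 13 and 8.
r_a, r_b, d = solve_distance(10 + 3, 5 + 3, 3)
print(f"r_a={r_a:.4f}, r_b={r_b:.4f}, d={d:.4f}")
```

Because a two-circle diagram has a closed-form overlap area, any correct fitter should reach an essentially zero error here; with three or more sets that is no longer guaranteed.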

Inclusive input

By default, values are interpreted as exclusive per-region areas, so "A": 10 is the area of A that lies outside every other set. If your numbers are total set sizes that include overlaps, pass input="inclusive" and the Eunoia core converts them internally:

fit = eu.euler({"A": 13, "B": 8, "A&B": 3}, input="inclusive")
fit.original_values, fit.fitted_values
({'A': 13.0, 'B': 8.0, 'A&B': 3.0},
 {'A': 12.999999999999993, 'B': 7.999999999999996, 'A&B': 2.999999999999985})
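To make the relationship between the two conventions concrete, here is a minimal sketch of an inclusive-to-exclusive conversion by inclusion-exclusion. It is an illustration only, not Eunoia's internal code; `inclusive_to_exclusive` is a hypothetical helper, and combinations missing from the input are assumed to be zero.

```python
def inclusive_to_exclusive(inclusive):
    """Convert inclusive sizes (|A|, |A&B|, ...) to exclusive per-region sizes."""
    # Parse "A&B" style keys into frozensets of set names.
    parsed = {frozenset(key.split("&")): value for key, value in inclusive.items()}
    exclusive = {}
    for region in parsed:
        total = 0.0
        # Inclusion-exclusion: alternate signs over every combination containing this region.
        for other, value in parsed.items():
            if region <= other:
                total += (-1) ** (len(other) - len(region)) * value
        exclusive["&".join(sorted(region))] = total
    return exclusive


exclusive = inclusive_to_exclusive({"A": 13, "B": 8, "A&B": 3})
print(exclusive)  # {'A': 10.0, 'B': 5.0, 'A&B': 3.0}
```

The result matches the exclusive input used in the two-set fit, which is why both calls describe the same diagram.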

Membership lists

Instead of region areas, you can pass the members of each set. Each element is counted toward the single region formed by exactly the sets it belongs to, giving exclusive per-region counts:

fit = eu.euler(
    {
        "A": ["x", "y", "z"],
        "B": ["y", "z", "w"],
        "C": ["z", "w", "q"],
    }
)
fit.original_values
{'A': 1.0, 'A&B': 1.0, 'A&B&C': 1.0, 'B&C': 1.0, 'C': 1.0}
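The counting rule above can be sketched in plain Python. This is a hedged illustration of the idea rather than the library's implementation; `region_counts` is a hypothetical name, and sorting the set names inside each key is an assumption made here for determinism.

```python
from collections import Counter


def region_counts(sets):
    """Count elements per exclusive region from a mapping of set name -> members."""
    membership = {}  # element -> names of the sets that contain it
    for name, members in sets.items():
        # set(...) drops duplicates within one set; str(...) normalizes labels.
        for element in set(map(str, members)):
            membership.setdefault(element, set()).add(name)
    # Each element contributes to exactly one region: the sets it belongs to.
    return dict(Counter("&".join(sorted(names)) for names in membership.values()))


counts = region_counts(
    {
        "A": ["x", "y", "z"],
        "B": ["y", "z", "w"],
        "C": ["z", "w", "q"],
    }
)
```

Here "z" appears in all three lists, so it lands only in "A&B&C" and not in "A&B" or "B&C".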

Elements are deduplicated within a set and stringified, so sets, tuples and non-string labels all work. venn() accepts the same shape (it only needs the set names):

eu.venn({"A": ["x", "y"], "B": ["y", "z"]}).plot();

DataFrames

A pandas or polars DataFrame (anything narwhals supports) is read as a membership matrix: each column is a set, each row an observation, and a truthy cell means that observation belongs to the set. Columns must be boolean or 0/1 numeric:

import pandas as pd

df = pd.DataFrame(
    {
        "A": [1, 1, 0, 1, 0],
        "B": [0, 1, 1, 1, 0],
        "C": [0, 0, 1, 1, 1],
    }
)
eu.euler(df).original_values
{'C': 1.0, 'B&C': 1.0, 'A': 1.0, 'A&B': 1.0, 'A&B&C': 1.0}

Rows that belong to no set are dropped, and venn(df) takes the column names as the set names. The same works for polars frames.
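As a rough illustration of how a membership matrix maps to region counts, the sketch below works on a plain column-oriented dict (the shape `df.to_dict("list")` returns in pandas) so that it runs without any DataFrame library. It is not Eunoia's code, and `matrix_region_counts` is a hypothetical helper.

```python
from collections import Counter


def matrix_region_counts(columns):
    """Count rows per exclusive region from a mapping of set name -> 0/1 column."""
    names = list(columns)
    counts = Counter()
    # Walk the matrix row by row.
    for row in zip(*(columns[name] for name in names)):
        present = [name for name, cell in zip(names, row) if cell]
        if present:  # rows that belong to no set are skipped
            counts["&".join(present)] += 1
    return dict(counts)


columns = {
    "A": [1, 1, 0, 1, 0],
    "B": [0, 1, 1, 1, 0],
    "C": [0, 0, 1, 1, 1],
}
counts = matrix_region_counts(columns)
print(counts)  # {'A': 1, 'A&B': 1, 'B&C': 1, 'A&B&C': 1, 'C': 1}
```

Each row contributes to exactly one region, so the counts are exclusive by construction.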

Three sets with ellipses

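Ellipses are more flexible than circles (five free parameters each, counting center, two semi-axes and rotation, against three for a circle) and can fit many three-set arrangements exactly: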

fit = eu.euler(
    {"A": 2, "B": 2, "C": 2, "A&B": 1, "A&C": 1, "B&C": 1},
    shape="ellipse",
)
print(f"diag_error = {fit.diag_error:.3g}")
fit.plot(quantities="fitted");
diag_error = 1.1e-12

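Custom styling

plot() accepts styling arguments. This example passes a list of colors, turns on quantity labels, and sets the edge line width: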

fit = eu.euler({"A": 10, "B": 7, "C": 8, "A&B": 3, "A&C": 4, "B&C": 2, "A&B&C": 1})
fit.plot(
    colors=["#e41a1c", "#377eb8", "#4daf4a"],
    quantities=True,
    edges={"linewidth": 1.5},
);

Math text in labels

Set names are drawn as matplotlib text, so anything between $…$ is rendered with matplotlib's mathtext engine. Use Greek letters, subscripts, or other TeX-style math markup in set names, and they carry through to the labels and legend:

fit = eu.euler(
    {
        r"$\alpha$": 10,
        r"$\beta$": 7,
        r"$\gamma$": 8,
        r"$\alpha$&$\beta$": 3,
        r"$\alpha$&$\gamma$": 4,
        r"$\beta$&$\gamma$": 2,
        r"$\alpha$&$\beta$&$\gamma$": 1,
    }
)
fit.plot();

Reproducibility

Pass a seed to fix the optimizer’s RNG, so that repeated fits of the same input give identical results:

fit_a = eu.euler({"A": 10, "B": 5, "A&B": 3}, seed=42)
fit_b = eu.euler({"A": 10, "B": 5, "A&B": 3}, seed=42)
fit_a.diag_error == fit_b.diag_error
True