Comparison with other Python packages
This page positions eunoia among the other Python packages for set-membership diagrams. It has two halves: a qualitative map of the landscape, and a quantitative benchmark of the packages that actually solve the same problem eunoia does — fitting area-proportional Euler diagrams.
We deliberately leave out eulerr: it is eunoia’s sibling for R, built on the same Rust core, so comparing the two would measure nothing about the Python ecosystem.
The landscape
“Draw a Venn diagram” covers several genuinely different problems, and the Python packages split along those lines. Only the first group is directly comparable to eunoia.
| Package | Area-proportional | Max sets | Shapes | Method | License |
|---|---|---|---|---|---|
| eunoia | yes | arbitrary | circle, ellipse, square, rectangle | numerical optimization (Rust core); reports residuals + goodness-of-fit | MIT |
| matplotlib-set-diagrams | yes | arbitrary | circles | optimization-based layout | GPL-3 |
| matplotlib-venn | yes | 3 | circles | closed-form (2 sets); cost-based layout (3 sets) | MIT |
| BioVenn | yes | 3 | circles | area-proportional, with biological ID mapping | MIT |
| vennplot | yes | 3 | circles / balls (2D + 3D) | area-proportional | MIT |
| supervenn | yes (exact) | many | bars / chunks | splits sets into parts; not an Euler diagram | MIT |
| pyvenn / venn | no | 6 | fixed templates | static shapes; only labels move | MIT |
| matplotlib-subsets | no | - | nested rectangles | set hierarchy, not an Euler layout | MIT |
| eule | n/a | arbitrary | none | set algebra only: computes region sizes, draws nothing | MIT |
Reading the table:
- Genuine area-proportional fitters: matplotlib-set-diagrams and matplotlib-venn are the only packages that solve eunoia’s problem, so they are the ones we benchmark below. BioVenn and vennplot are also area-proportional but capped at three circles, so they add little signal beyond matplotlib-venn.
- A different representation: supervenn is exactly proportional but draws bar/chunk strips rather than overlapping shapes. It is a great tool, just not an Euler diagram, so it can’t be scored on the same geometry metric.
- Not area-proportional: pyvenn/venn and matplotlib-subsets use fixed templates or nested rectangles; the picture does not encode the set sizes.
- Complementary, not a competitor: eule computes the disjoint region sizes from membership and draws nothing. It is the kind of preprocessing that feeds a fitter; eunoia does the same thing internally when you pass it membership lists (eu.euler({"A": [...], "B": [...]})) or a pandas/polars DataFrame used as a membership matrix.
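The preprocessing step described in the last item, turning membership lists into disjoint region sizes, can be sketched in plain Python. This is an illustration only: the function name `disjoint_region_sizes` is hypothetical and is not the API of eule or eunoia.

```python
from collections import Counter


def disjoint_region_sizes(membership):
    """Count the elements in each exclusive region of a set system.

    `membership` maps a set name to an iterable of its elements. The result
    maps a frozenset of set names to the number of elements that belong to
    exactly those sets and to no others.
    """
    owners = {}
    for name, elements in membership.items():
        for element in set(elements):  # ignore duplicates within one set
            owners.setdefault(element, set()).add(name)
    return Counter(frozenset(names) for names in owners.values())


regions = disjoint_region_sizes({"A": [1, 2, 3, 4], "B": [3, 4, 5], "C": [4, 6]})
for names, size in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print("&".join(sorted(names)), size)
```

These exclusive counts are the targets an area-proportional fitter tries to reproduce as region areas.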
Benchmark
Why the comparison is grouped by objective
Comparing area-proportional fitters is subtle because they do not all minimize the same thing, and scoring a fitter on an objective it never targeted is unfair. Each package minimizes a different loss:
| Package / config | Minimizes |
|---|---|
| matplotlib-venn (2 sets) | closed-form exact |
| matplotlib-venn (3 sets, venn3) | Σ\|log(1+fitted) − log(1+target)\| (logarithmic L1) |
| matplotlib-set-diagrams | a selectable cost: |
| eunoia | a selectable |
So the only fair comparison is within an objective: pick a loss family, run
the packages that can minimize it (configuring each to do so), and score them on
that same loss. eunoia and matplotlib-set-diagrams are configurable, so they
appear in several groups; matplotlib-venn is fixed, so it appears only in the
logarithmic group its venn3 default defines.
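As a rough sketch, the three loss families can be written over paired lists of fitted and target region areas. The function names mirror the loss identifiers quoted on this page, but the code is illustrative and is not taken from any of the packages.

```python
import math


def sum_squared(fitted, target):
    """Squared family: sum of (fitted - target)^2."""
    return sum((f - t) ** 2 for f, t in zip(fitted, target))


def sum_absolute(fitted, target):
    """Absolute family: sum of |fitted - target|."""
    return sum(abs(f - t) for f, t in zip(fitted, target))


def log_sum_absolute(fitted, target):
    """Logarithmic family: sum of |log(1 + fitted) - log(1 + target)|."""
    return sum(abs(math.log1p(f) - math.log1p(t)) for f, t in zip(fitted, target))


fitted = [4.0, 1.0, 2.0]
target = [5.0, 1.0, 1.0]
print(sum_squared(fitted, target), sum_absolute(fitted, target))
print(round(log_sum_absolute(fitted, target), 4))
```

In this example the two mismatched regions contribute equally to the squared and absolute losses, while the logarithmic loss penalizes the error on the small region more heavily. That is why a fitter tuned for one family can look worse when scored on another.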
Each group is scored on a scale-invariant version of its loss — a single
multiplicative scale on the fitted areas is absorbed, because each package draws
its diagram at an arbitrary size. For the squared family this scale-invariant
score is exactly venneuler/eulerr stress; the absolute and logarithmic
families use the analogous scale-invariant L1 and log-L1.
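For the squared family, the scale-invariant score can be sketched as follows, assuming the usual regression-through-the-origin formulation of stress: find the single multiplicative scale that best maps fitted areas onto targets, then report the residual sum of squares relative to the total. The benchmark harness may compute it differently in detail; this is a minimal illustration.

```python
def stress(fitted, target):
    """Scale-invariant squared error between fitted and target areas."""
    ff = sum(f * f for f in fitted)
    tt = sum(t * t for t in target)
    ft = sum(f * t for f, t in zip(fitted, target))
    if ff == 0 or tt == 0:
        raise ValueError("areas must not all be zero")
    scale = ft / ff  # least-squares scale applied to the fitted areas
    residual = sum((t - scale * f) ** 2 for f, t in zip(fitted, target))
    return residual / tt


print(stress([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]))  # perfectly proportional fit
print(stress([1.0, 1.0], [1.0, 2.0]))
```

Algebraically this equals 1 − (Σ fitted·target)² / (Σ fitted² · Σ target²), so the value does not depend on which side the scale is applied to or on the size at which a package draws its diagram.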
Of these packages, only eunoia reports any goodness-of-fit number itself; the
harness re-measures every fitter identically (rasterizing the returned shapes)
and validates that eunoia’s rasterized stress matches the value eunoia
computes analytically (they agree to grid resolution).
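Re-measuring a fitter by rasterization can be sketched like this for circles: sample a regular grid over the bounding box, record which shapes cover each sample point, and accumulate cell area per combination. The function below is a hypothetical pure-Python illustration, not the harness code, and it assumes circles given as (x, y, r) tuples.

```python
def rasterized_region_areas(circles, resolution=400):
    """Estimate disjoint region areas for circles given as (x, y, r) tuples.

    Returns a dict mapping a frozenset of circle indices to the estimated
    area covered by exactly those circles.
    """
    xmin = min(x - r for x, y, r in circles)
    xmax = max(x + r for x, y, r in circles)
    ymin = min(y - r for x, y, r in circles)
    ymax = max(y + r for x, y, r in circles)
    dx = (xmax - xmin) / resolution
    dy = (ymax - ymin) / resolution
    areas = {}
    for i in range(resolution):
        px = xmin + (i + 0.5) * dx  # sample at the cell centre
        for j in range(resolution):
            py = ymin + (j + 0.5) * dy
            inside = frozenset(
                k for k, (x, y, r) in enumerate(circles)
                if (px - x) ** 2 + (py - y) ** 2 <= r * r
            )
            if inside:
                areas[inside] = areas.get(inside, 0.0) + dx * dy
    return areas


# Two unit circles whose centres are one unit apart.
areas = rasterized_region_areas([(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
print({tuple(sorted(k)): round(v, 3) for k, v in areas.items()})
```

The estimate converges to the analytic areas as the grid is refined, which is the sense in which a rasterized score and an analytic one can agree to grid resolution.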
The specifications are a curated subset of the eunoia Rust corpus
(crates/eunoia/src/test_utils/corpus.rs), itself ported from
eulerr’s reproducibility tests and real
datasets from the eulerr issue tracker. They span 2 to 6 sets and include
layouts circles provably cannot fit exactly, plus real biology and kinase data.
The full harness lives in
benchmarks/;
reproduce with task benchmark.
Accuracy, grouped by objective
Sum of squared errors
Objective: minimize Σ (fitted - target)²; scored on stress (lower is better).
| Case | Sets | eunoia (circle) | eunoia (ellipse) | matplotlib-set-diagrams |
|---|---|---|---|---|
|  | 2 | 0.0000 | 0.0000 | 0.0000 |
|  | 2 | 0.0000 | 0.0000 | 0.0082 |
|  | 3 | 0.0002 | 0.0000 | 0.0002 |
|  | 3 | 0.0087 | 0.0000 | 0.1477 |
|  | 3 | 0.0004 | 0.0000 | 0.0007 |
|  | 3 | 0.0245 | 0.0000 | 0.0281 |
|  | 3 | 0.0000 | 0.0000 | 0.0000 |
|  | 4 | 0.0135 | 0.0000 | 0.1223 |
|  | 4 | 0.0041 | 0.0000 | 0.0097 |
|  | 5 | 0.0112 | 0.0000 | 0.0952 |
|  | 6 | 0.0042 | 0.0000 | 0.1012 |
Sum of absolute errors
Objective: minimize Σ |fitted - target|; scored on abs_error (lower is better).
| Case | Sets | eunoia (circle) | eunoia (ellipse) | matplotlib-set-diagrams |
|---|---|---|---|---|
|  | 2 | 0.0000 | 0.0000 | 0.0000 |
|  | 2 | 0.0001 | 0.0001 | 0.0022 |
|  | 3 | 0.0073 | 0.0048 | 0.0121 |
|  | 3 | 0.0650 | 0.0004 | 0.0862 |
|  | 3 | 0.0116 | 0.0080 | 0.0289 |
|  | 3 | 0.0806 | 0.0044 | 0.1621 |
|  | 3 | 0.0003 | 0.0002 | 0.0024 |
|  | 4 | 0.0961 | 0.0511 | 0.2477 |
|  | 4 | 0.0801 | 0.0541 | 0.1123 |
|  | 5 | 0.0979 | 0.0900 | 0.4890 |
|  | 6 | 0.0412 | 0.0364 | 0.3656 |
Logarithmic error
Objective: minimize Σ |log(1+fitted) - log(1+target)|; scored on log_error (lower is better).
| Case | Sets | eunoia (circle) | eunoia (ellipse) | matplotlib-venn | matplotlib-set-diagrams |
|---|---|---|---|---|---|
|  | 2 | 0.0013 | 0.0013 | 0.0013 | 0.0013 |
|  | 2 | 0.0020 | 0.0020 | 0.0020 | 0.0021 |
|  | 3 | 0.0163 | 0.0016 | 0.0197 | 0.0193 |
|  | 3 | 0.0644 | 0.0015 | 0.0653 | 0.0652 |
|  | 3 | 0.0852 | 0.0024 | 0.1449 | 0.1078 |
|  | 3 | 0.1161 | 0.0109 | 0.1449 | 0.1452 |
|  | 3 | 0.0046 | 0.0004 | 0.0060 | 0.0057 |
|  | 4 | 0.1544 | 0.0569 | - | 0.1679 |
|  | 4 | 0.1607 | 0.0577 | - | 0.2154 |
|  | 5 | 0.2214 | 0.1881 | - | 0.6291 |
|  | 6 | 0.0349 | 0.0241 | - | 0.3224 |
Each panel is one objective; bars are the scale-invariant score for that objective (lower is better, log scale). Within a panel every fitter minimized the same loss, so the comparison is apples-to-apples. Bars are absent where a package cannot represent that set count.
Three things stand out:
- Matched on the same objective, eunoia’s circles match or beat the other circle fitters in every group. In the squared-error group eunoia’s circles reach a stress no higher than matplotlib-set-diagrams("squared") on every case (the two tie only where both round to the same four decimals), and lower by roughly an order of magnitude on the harder specs; the absolute-error group tells the same story with matplotlib-set-diagrams("simple"). Most tellingly, in the logarithmic group, which is matplotlib-venn’s own default objective, eunoia’s circles are never worse than either matplotlib-venn or matplotlib-set-diagrams("logarithmic"). The two competitors are close to each other on most cases, as expected since they minimize the same thing. Given the same loss, eunoia’s optimizer lands at least as close, and usually closer.
- Ellipses then win outright. eunoia’s ellipses are the best or tied-best fit in every group, on every case: essentially zero error under the squared loss, and the lowest error under the absolute and logarithmic ones, often by a wide margin. This is geometry no circle-only package can match.
- matplotlib-venn is fixed and capped. It offers no choice of objective (its venn3 is a fixed logarithmic layout) and cannot draw four or more sets. matplotlib-set-diagrams scales to any set count and is configurable, but it does not beat eunoia on any case within a shared objective.
eunoia only joined the logarithmic group because the core gained a
"log_sum_absolute" loss in 1.1 (closing
jolars/eunoia#96); choosing the
objective to match the data is itself part of what eunoia offers here.
Wall-clock fit time
Accuracy is not the whole story; here is end-to-end fit time (one
representative configuration per package).
Median end-to-end fit wall-clock time in milliseconds (median of 5 runs; lower is faster). Indicative only — timings are machine- and load-dependent.
| Case | Sets | eunoia (circle) | eunoia (ellipse) | matplotlib-venn | matplotlib-set-diagrams |
|---|---|---|---|---|---|
|  | 2 | 3 | 8 | 5 | 9 |
|  | 2 | 11 | 14 | 7 | 340 |
|  | 3 | 18 | 30 | 11 | 614 |
|  | 3 | 534 | 28 | 10 | 575 |
|  | 3 | 23 | 75 | 11 | 197 |
|  | 3 | 486 | 40 | 11 | 726 |
|  | 3 | 44 | 54 | 11 | 714 |
|  | 4 | 1107 | 197 | - | 29 |
|  | 4 | 917 | 1132 | - | 651 |
|  | 5 | 1312 | 3823 | - | 3056 |
|  | 6 | 2234 | 231 | - | 9742 |
Median fit time per case (log scale), each package under the same configuration
as the gallery. matplotlib-venn is fastest on six of the seven cases it can
draw, but it is capped at three sets. eunoia is faster than
matplotlib-set-diagrams on every two- and three-set case, and results are mixed
at four or more sets; on the hardest high-set specs eunoia takes up to about
4 seconds and matplotlib-set-diagrams up to about 10. (Separately, eunoia’s
non-smooth losses, "sum_absolute" and "log_sum_absolute", are markedly slower to
optimize than the smooth "sum_squared" default shown here.)
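A median-of-five wall-clock measurement like the one reported here can be sketched with the standard library. The helper name `median_fit_ms` and the stand-in workload are hypothetical; the real harness times each package's fit call instead.

```python
import statistics
import time


def median_fit_ms(fit, runs=5):
    """Return the median wall-clock time of calling `fit()` in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fit()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload; replace with a call that fits one diagram.
elapsed = median_fit_ms(lambda: sum(i * i for i in range(10_000)))
print(f"{elapsed:.3f} ms")
```

Taking the median rather than the mean keeps a single slow run, for example one hit by a cold cache or background load, from dominating the reported figure.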
Fitted layouts
Fitted layouts on representative corpus cases (eunoia under its default squared
loss; set-diagrams under "squared"). The four-, five-, and six-set rows show
matplotlib-venn dropping out, and eunoia’s ellipse column staying faithful
where the circle columns visibly distort.
When to reach for what
- eunoia when you want the most faithful diagram, especially with ellipses, with four or more sets, when you need to choose the objective (loss=) to suit your data, or when you need the residuals and goodness-of-fit numbers to judge whether the diagram can be trusted at all. MIT-licensed.
- matplotlib-venn for a quick, dependency-light two- or three-circle Venn where exactness is not critical. Also MIT.
- matplotlib-set-diagrams if you specifically want its word-cloud subset labels and are comfortable with GPL-3. Be prepared to try its cost_function_objective options, since the default ("inverse") fits area-dominated diagrams poorly.
- supervenn when exact proportionality matters more than the Euler-diagram shape, e.g. many sets with complex overlaps.
Reproducing
```shell
task benchmark
# or
uv sync --group benchmark
uv run --group benchmark python -m benchmarks.run
```
The competitor packages are confined to an isolated benchmark dependency
group and are never part of the published eunoia wheel. We only run them to
measure fit quality; matplotlib-set-diagrams is GPL-3, so no competitor source
is vendored into or redistributed by eunoia.
Note
The numbers and figures on this page are committed to the repo and refreshed by running the benchmark; the documentation build itself does not install or execute the competitor packages.