Comparison with other Python packages

This page positions eunoia among the other Python packages for set-membership diagrams. It has two halves: a qualitative map of the landscape, and a quantitative benchmark of the packages that actually solve the same problem eunoia does — fitting area-proportional Euler diagrams.

We deliberately leave out eulerr: it is eunoia’s sibling for R, built on the same Rust core, so comparing the two would measure nothing about the Python ecosystem.

The landscape

“Draw a Venn diagram” covers several genuinely different problems, and the Python packages split along those lines. Only the first group is directly comparable to eunoia.

| Package | Area-proportional | Max sets | Shapes | Method | License |
|---|---|---|---|---|---|
| eunoia | yes | arbitrary | circle, ellipse, square, rectangle | numerical optimization (Rust core); reports residuals + goodness-of-fit | MIT |
| matplotlib-set-diagrams | yes | arbitrary | circles | optimization-based layout | GPL-3 |
| matplotlib-venn | yes | 3 | circles | closed-form (2 sets); cost-based layout (3 sets) | MIT |
| BioVenn | yes | 3 | circles | area-proportional, with biological ID mapping | MIT |
| vennplot | yes | 3 | circles / balls (2D + 3D) | area-proportional | MIT |
| supervenn | yes (exact) | many | bars / chunks | splits sets into parts; not an Euler diagram | MIT |
| pyvenn / venn | no | 6 | fixed templates | static shapes; only labels move | MIT |
| matplotlib-subsets | no | | nested rectangles | set hierarchy, not an Euler layout | MIT |
| eule | n/a | arbitrary | none | set algebra only: computes region sizes, draws nothing | MIT |

Reading the table:

  • Genuine area-proportional fitters: matplotlib-set-diagrams and matplotlib-venn are the only other packages that solve eunoia’s problem, so they are the ones we benchmark below. BioVenn and vennplot are also area-proportional but capped at three circles, so they add little signal beyond matplotlib-venn.

  • A different representation: supervenn is exactly proportional but draws bar/chunk strips rather than overlapping shapes. It is a great tool, just not an Euler diagram, so it can’t be scored on the same geometry metric.

  • Not area-proportional: pyvenn/venn and matplotlib-subsets use fixed templates or nested rectangles; the picture does not encode the set sizes.

  • Complementary, not a competitor: eule computes the disjoint region sizes from membership and draws nothing. It is the kind of preprocessing that feeds a fitter; eunoia does the same thing internally when you pass it membership lists (eu.euler({"A": [...], "B": [...]})) or a pandas/polars DataFrame used as a membership matrix.
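The preprocessing step described in the last bullet, turning membership lists into disjoint region sizes, can be sketched in a few lines of plain Python. This is only an illustration of the idea: the function name `disjoint_region_sizes` is hypothetical and is not the API of eule or eunoia.

```python
from collections import Counter


def disjoint_region_sizes(sets):
    """Count the elements in each exclusive region of a family of sets.

    `sets` maps a set name to an iterable of members. The result maps a
    sorted tuple of set names to the number of elements that belong to
    exactly those sets and to no others.
    (Hypothetical helper for illustration; not a library function.)
    """
    membership = {}
    for name, members in sets.items():
        for element in set(members):  # set() ignores duplicate members
            membership.setdefault(element, set()).add(name)
    return Counter(tuple(sorted(names)) for names in membership.values())


regions = disjoint_region_sizes({"A": [1, 2, 3, 4], "B": [3, 4, 5], "C": [4, 6]})
for combo, size in sorted(regions.items()):
    print("&".join(combo), size)
```

Here element 4 belongs to all three sets, so the A&B&C region has size 1, while elements 1 and 2 belong only to A. These exclusive counts are the targets an area-proportional fitter tries to reproduce as areas.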

Benchmark

Why the comparison is grouped by objective

Comparing area-proportional fitters is subtle because they do not all minimize the same thing, and scoring a fitter on an objective it never targeted is unfair. Each package minimizes a different loss:

| Package / config | Minimizes |
|---|---|
| matplotlib-venn venn2 | closed-form exact |
| matplotlib-venn venn3 (default) | Σ\|log(1+fitted) − log(1+target)\| (logarithmic L1) |
| matplotlib-set-diagrams | a selectable cost: "squared" (Σ(f−t)²), "simple" (Σ\|f−t\|), "logarithmic", "relative", "inverse" |
| eunoia | a selectable loss=: "sum_squared", "sum_absolute", "log_sum_absolute", "stress", "diag_error", … |

So the only fair comparison is within an objective: pick a loss family, run the packages that can minimize it (configuring each to do so), and score them on that same loss. eunoia and matplotlib-set-diagrams are configurable, so they appear in several groups; matplotlib-venn is fixed, so it appears only in the logarithmic group its venn3 default defines.
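To see why the objective matters, here is a minimal sketch of the three loss families written directly from the formulas in the table above. The function names mirror eunoia’s loss= strings for readability, but the bodies are an assumption based on those formulas, not code from any of the packages, which may normalize or weight the terms differently.

```python
import math


def sum_squared(fitted, target):
    # Sum of squared differences between fitted and target region areas.
    return sum((f - t) ** 2 for f, t in zip(fitted, target))


def sum_absolute(fitted, target):
    # Sum of absolute differences (L1).
    return sum(abs(f - t) for f, t in zip(fitted, target))


def log_sum_absolute(fitted, target):
    # L1 distance after a log(1 + x) transform of both areas.
    return sum(abs(math.log1p(f) - math.log1p(t)) for f, t in zip(fitted, target))


target = [10.0, 5.0, 1.0]
fit_a = [10.0, 5.0, 3.0]  # misses the smallest region by 2
fit_b = [12.0, 5.0, 1.0]  # misses the largest region by 2

for loss in (sum_squared, sum_absolute, log_sum_absolute):
    print(loss.__name__, loss(fit_a, target), loss(fit_b, target))
```

The squared and absolute losses rate the two fits identically, while the logarithmic loss penalizes the miss on the small region far more heavily. A fitter tuned for one objective can therefore look worse than it is when scored on another.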

Each group is scored on a scale-invariant version of its loss — a single multiplicative scale on the fitted areas is absorbed, because each package draws its diagram at an arbitrary size. For the squared family this scale-invariant score is exactly venneuler/eulerr stress; the absolute and logarithmic families use the analogous scale-invariant L1 and log-L1.
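As a rough illustration of what scale invariance means for the squared family, the sketch below computes stress under the assumption that it follows the definition given in the venneuler and eulerr documentation: regress the fitted areas on the target areas through the origin, then divide the residual sum of squares by the sum of squared fitted areas. The benchmark harness may compute it differently in detail.

```python
def stress(fitted, target):
    """venneuler-style stress (assumed definition, see lead-in).

    beta is the least-squares slope of fitted on target through the
    origin, so any uniform rescaling of the fitted areas is absorbed.
    """
    beta = sum(f * t for f, t in zip(fitted, target)) / sum(t * t for t in target)
    residual = sum((f - beta * t) ** 2 for f, t in zip(fitted, target))
    return residual / sum(f * f for f in fitted)


target = [10.0, 5.0, 1.0]
fitted = [9.0, 6.0, 1.5]
scaled = [3.0 * f for f in fitted]  # same diagram drawn three times larger

print(stress(fitted, target))
print(stress(scaled, target))
```

Both calls give the same value up to floating-point rounding, which is the property that lets diagrams drawn at arbitrary sizes be compared on one score.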

Of these packages, only eunoia reports any goodness-of-fit number itself; the harness re-measures every fitter identically (rasterizing the returned shapes) and validates that eunoia’s rasterized stress matches the value eunoia computes analytically (they agree to grid resolution).
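The rasterized re-measurement can be pictured with a small pure-Python sketch. It is not the harness code: the function name, the circle-only input format, and the default resolution are all assumptions chosen to keep the example short. It samples pixel centres, records which circles contain each one, and compares the result with the analytic lens area of two overlapping unit circles.

```python
import math


def rasterized_region_areas(circles, resolution=400):
    """Estimate exclusive-region areas of circles on a pixel grid.

    `circles` maps a name to (x, y, r). Returns {tuple of names: area}.
    (Hypothetical helper; the real harness may work differently.)
    """
    xmin = min(x - r for x, y, r in circles.values())
    xmax = max(x + r for x, y, r in circles.values())
    ymin = min(y - r for x, y, r in circles.values())
    ymax = max(y + r for x, y, r in circles.values())
    dx = (xmax - xmin) / resolution
    dy = (ymax - ymin) / resolution
    areas = {}
    for i in range(resolution):
        px = xmin + (i + 0.5) * dx  # pixel centre, x
        for j in range(resolution):
            py = ymin + (j + 0.5) * dy  # pixel centre, y
            inside = tuple(sorted(
                name for name, (x, y, r) in circles.items()
                if (px - x) ** 2 + (py - y) ** 2 <= r * r
            ))
            if inside:
                areas[inside] = areas.get(inside, 0.0) + dx * dy
    return areas


# Two unit circles whose centres are one unit apart.
measured = rasterized_region_areas({"A": (0.0, 0.0, 1.0), "B": (1.0, 0.0, 1.0)})
# Analytic lens area for equal radii r = 1 and centre distance d = 1.
lens = 2.0 * math.acos(0.5) - 0.5 * math.sqrt(3.0)
print(measured[("A", "B")], lens)
```

The two numbers agree only up to the grid resolution, which is the same kind of agreement the harness checks between eunoia’s rasterized and analytic stress.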

The specifications are a curated subset of the eunoia Rust corpus (crates/eunoia/src/test_utils/corpus.rs), itself ported from eulerr’s reproducibility tests and real datasets from the eulerr issue tracker. They span 2 to 6 sets and include layouts that circles provably cannot fit exactly, plus real biology and kinase data. The full harness lives in benchmarks/; reproduce with task benchmark.

Accuracy, grouped by objective

Sum of squared errors

Objective: minimize Σ (fitted - target)²; scored on stress (lower is better).

| Case | Sets | eunoia (circle) | eunoia (ellipse) | matplotlib-set-diagrams |
|---|---|---|---|---|
| two_disjoint | 2 | 0.0000 | 0.0000 | 0.0000 |
| two_overlap | 2 | 0.0000 | 0.0000 | 0.0082 |
| three_set_small_overlaps | 3 | 0.0002 | 0.0000 | 0.0002 |
| uniform_3_set | 3 | 0.0087 | 0.0000 | 0.1477 |
| eulerape_3_set | 3 | 0.0004 | 0.0000 | 0.0007 |
| issue47_3_set_huge_triple | 3 | 0.0245 | 0.0000 | 0.0281 |
| issue111_3_set_asymmetric | 3 | 0.0000 | 0.0000 | 0.0000 |
| issue114_4_set_dominant_quad | 4 | 0.0135 | 0.0000 | 0.1223 |
| issue103_4_set | 4 | 0.0041 | 0.0000 | 0.0097 |
| issue93_5_set_kinases | 5 | 0.0112 | 0.0000 | 0.0952 |
| wilkinson_6_set | 6 | 0.0042 | 0.0000 | 0.1012 |

Sum of absolute errors

Objective: minimize Σ |fitted - target|; scored on abs_error (lower is better).

| Case | Sets | eunoia (circle) | eunoia (ellipse) | matplotlib-set-diagrams |
|---|---|---|---|---|
| two_disjoint | 2 | 0.0000 | 0.0000 | 0.0000 |
| two_overlap | 2 | 0.0001 | 0.0001 | 0.0022 |
| three_set_small_overlaps | 3 | 0.0073 | 0.0048 | 0.0121 |
| uniform_3_set | 3 | 0.0650 | 0.0004 | 0.0862 |
| eulerape_3_set | 3 | 0.0116 | 0.0080 | 0.0289 |
| issue47_3_set_huge_triple | 3 | 0.0806 | 0.0044 | 0.1621 |
| issue111_3_set_asymmetric | 3 | 0.0003 | 0.0002 | 0.0024 |
| issue114_4_set_dominant_quad | 4 | 0.0961 | 0.0511 | 0.2477 |
| issue103_4_set | 4 | 0.0801 | 0.0541 | 0.1123 |
| issue93_5_set_kinases | 5 | 0.0979 | 0.0900 | 0.4890 |
| wilkinson_6_set | 6 | 0.0412 | 0.0364 | 0.3656 |

Logarithmic error

Objective: minimize Σ |log(1+fitted) - log(1+target)|; scored on log_error (lower is better).

| Case | Sets | eunoia (circle) | eunoia (ellipse) | matplotlib-venn | matplotlib-set-diagrams |
|---|---|---|---|---|---|
| two_disjoint | 2 | 0.0013 | 0.0013 | 0.0013 | 0.0013 |
| two_overlap | 2 | 0.0020 | 0.0020 | 0.0020 | 0.0021 |
| three_set_small_overlaps | 3 | 0.0163 | 0.0016 | 0.0197 | 0.0193 |
| uniform_3_set | 3 | 0.0644 | 0.0015 | 0.0653 | 0.0652 |
| eulerape_3_set | 3 | 0.0852 | 0.0024 | 0.1449 | 0.1078 |
| issue47_3_set_huge_triple | 3 | 0.1161 | 0.0109 | 0.1449 | 0.1452 |
| issue111_3_set_asymmetric | 3 | 0.0046 | 0.0004 | 0.0060 | 0.0057 |
| issue114_4_set_dominant_quad | 4 | 0.1544 | 0.0569 | | 0.1679 |
| issue103_4_set | 4 | 0.1607 | 0.0577 | | 0.2154 |
| issue93_5_set_kinases | 5 | 0.2214 | 0.1881 | | 0.6291 |
| wilkinson_6_set | 6 | 0.0349 | 0.0241 | | 0.3224 |

[Figure: grouped bar charts, one panel per objective, log scale.]

Each panel is one objective; bars are the scale-invariant score for that objective (lower is better, log scale). Within a panel every fitter minimized the same loss, so the comparison is apples-to-apples. Bars are absent where a package cannot represent that set count.

Three things stand out:

  1. Matched on the same objective, eunoia’s circles match or beat the other circle fitters in every group. In the squared-error group eunoia’s circles reach a stress no higher than matplotlib-set-diagrams("squared") on every case (the two tie at the reported precision on two_disjoint, three_set_small_overlaps, and issue111_3_set_asymmetric), and roughly an order of magnitude lower on harder specs such as uniform_3_set, issue114_4_set_dominant_quad, issue93_5_set_kinases, and wilkinson_6_set; the absolute-error group tells the same story with matplotlib-set-diagrams("simple"). Most tellingly, in the logarithmic group (matplotlib-venn’s own default objective) eunoia’s circles are at least as good as both matplotlib-venn and matplotlib-set-diagrams("logarithmic") on every case, tying only on the two-set cases. The two competitors are mostly neck-and-neck with each other, as expected since they minimize the same thing. Given the same loss, eunoia’s optimizer lands closer.

  2. Ellipses then win outright. eunoia’s ellipses are the best fit, or tied for best at the reported precision, in every group and on every case: they reach essentially zero error under the squared loss and the lowest error, often by a wide margin, under the absolute and logarithmic ones. This is geometry no circle-only package can match.

  3. matplotlib-venn is fixed and capped. It offers no choice of objective (its venn3 is a fixed logarithmic layout) and cannot draw four or more sets. matplotlib-set-diagrams scales to any set count and is configurable, but it does not beat eunoia on any case within any shared objective.

eunoia only joined the logarithmic group because the core gained a "log_sum_absolute" loss in 1.1 (closing jolars/eunoia#96); choosing the objective to match the data is itself part of what eunoia offers here.

Wall-clock fit time

Accuracy is not the whole story; here is end-to-end fit time (one representative configuration per package).

Median end-to-end fit wall-clock time in milliseconds (median of 5 runs; lower is faster). Indicative only — timings are machine- and load-dependent.

| Case | Sets | eunoia (circle) | eunoia (ellipse) | matplotlib-venn | matplotlib-set-diagrams |
|---|---|---|---|---|---|
| two_disjoint | 2 | 3 | 8 | 5 | 9 |
| two_overlap | 2 | 11 | 14 | 7 | 340 |
| three_set_small_overlaps | 3 | 18 | 30 | 11 | 614 |
| uniform_3_set | 3 | 534 | 28 | 10 | 575 |
| eulerape_3_set | 3 | 23 | 75 | 11 | 197 |
| issue47_3_set_huge_triple | 3 | 486 | 40 | 11 | 726 |
| issue111_3_set_asymmetric | 3 | 44 | 54 | 11 | 714 |
| issue114_4_set_dominant_quad | 4 | 1107 | 197 | | 29 |
| issue103_4_set | 4 | 917 | 1132 | | 651 |
| issue93_5_set_kinases | 5 | 1312 | 3823 | | 3056 |
| wilkinson_6_set | 6 | 2234 | 231 | | 9742 |

[Figure: grouped bar chart of median fit time per case, log scale.]

Median fit time per case (log scale), each package under the same configuration as the gallery. matplotlib-venn is fastest on every case it supports except two_disjoint, but it is capped at three sets; eunoia and matplotlib-set-diagrams are broadly comparable, taking from a fraction of a second to several seconds on the four-to-six-set specs (matplotlib-set-diagrams approaches ten seconds on wilkinson_6_set). Separately, eunoia’s non-smooth losses ("sum_absolute" and "log_sum_absolute") are markedly slower to optimize than the smooth "sum_squared" default shown here.
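The timing protocol (median of 5 end-to-end runs, in milliseconds) can be approximated with the standard library alone. The sketch below is not the harness in benchmarks/: `median_fit_time_ms` and `dummy_fit` are hypothetical names, and the dummy workload stands in for a real fitter call.

```python
import statistics
import time


def median_fit_time_ms(fit, runs=5):
    """Call `fit` `runs` times and return the median wall-clock time in ms."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fit()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def dummy_fit():
    # Stand-in workload; replace with a call to the fitter being measured.
    sum(i * i for i in range(10_000))


print(f"{median_fit_time_ms(dummy_fit):.2f} ms")
```

Using the median rather than the mean keeps a single slow run (for example, a cold cache or a background process) from dominating the reported figure, though the result remains machine- and load-dependent.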

Fitted layouts

[Figure: grid of fitted layouts, one column per fitter, one row per case.]

Fitted layouts on representative corpus cases (eunoia under its default squared loss; set-diagrams under "squared"). The four-, five-, and six-set rows show matplotlib-venn dropping out, and eunoia’s ellipse column staying faithful where the circle columns visibly distort.

When to reach for what

  • eunoia when you want the most faithful diagram — especially with ellipses, with four or more sets, when you need to choose the objective (loss=) to suit your data, or when you need the residuals and goodness-of-fit numbers to judge whether the diagram can be trusted at all. MIT-licensed.

  • matplotlib-venn for a quick, dependency-light two- or three-circle Venn where exactness is not critical. Also MIT.

  • matplotlib-set-diagrams if you specifically want its word-cloud subset labels and are comfortable with GPL-3 — and be prepared to try its cost_function_objective options, since the default ("inverse") fits area-dominated diagrams poorly.

  • supervenn when exact proportionality matters more than the Euler-diagram shape, e.g. many sets with complex overlaps.

Reproducing

task benchmark
# or
uv sync --group benchmark
uv run --group benchmark python -m benchmarks.run

The competitor packages are confined to an isolated benchmark dependency group and are never part of the published eunoia wheel. We only run them to measure fit quality; matplotlib-set-diagrams is GPL-3, so no competitor source is vendored into or redistributed by eunoia.

Note

The numbers and figures on this page are committed to the repo and refreshed by running the benchmark; the documentation build itself does not install or execute the competitor packages.