# Performance Evaluation

| query                                            | timing (s) |
| ------------------------------------------------ | ---------- |
| `Call(Name("len"))`                              | 0.025985   |
| `BinOp(op=Add() \| Sub())`                       | 0.030508   |
| `Try(handlers=LEN(min=3, max=5))`                | 0.033486   |
| `BinOp(left=Constant(), right=Constant())`       | 0.146516   |
| `FunctionDef(f"run_%", returns = not None)`      | 0.0216     |
| `ClassDef(body=[Assign(), *..., FunctionDef()])` | 0.28737    |

## Analysis

Two steps account for nearly 95% of the total query time. The first, and the most
obvious, is running the query in the database. There are a few things Reiz can do to
optimize this step, including trying to generate the best possible query while still
working in a linear fashion (needed to support constructs like reference variables).
The code generator (`reiz.reizql.compiler`) has gone through a couple of major
refactors for performance reasons (e.g. [#12](https://github.com/reizio/reiz.io/pull/12)).
There is also a simple, naive [AST optimization pass](https://github.com/reizio/reiz.io/blob/cff3cc6eaad532ac1a956c1f7c7a58d97ea00e4b/reiz/ir/backends/edgeql.py#L461-L513) on
the IR (EdgeQL) itself.

The second step is retrieving the code snippets from disk. We already store a lot of
metadata (such as start/end positions and the GitHub project), but the actual source
still lives on disk. So after retrieving the filenames from the query, we simply read
those files and extract the relevant segments. This area is open to further
optimization (we could statically determine the byte range and fetch only that, or
parallelize the reads across multiple matches \[the default result set comes with
10 matches\], ...), though these would not pay off as much as a faster database
step.
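
The two ideas mentioned above (fetching only a known byte range, and reading several
matches in parallel) could be sketched roughly as follows. This is not how Reiz
currently reads snippets; the `Match` record, its field names, and the
`read_segment`/`fetch_snippets` helpers are hypothetical, and the sketch assumes that
byte offsets for each match have already been computed.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Match:
    # Hypothetical record; these field names are illustrative,
    # not the metadata schema Reiz actually stores.
    filename: Path
    start: int  # byte offset of the first byte of the match
    end: int  # byte offset one past the last byte of the match


def read_segment(match: Match) -> str:
    # Seek straight to the match instead of reading the whole file.
    with open(match.filename, "rb") as stream:
        stream.seek(match.start)
        return stream.read(match.end - match.start).decode("utf-8")


def fetch_snippets(matches, max_workers=10):
    # File reads are I/O bound, so a thread pool is sufficient here;
    # pool.map() returns results in the same order as the input.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_segment, matches))
```

The default of 10 workers simply mirrors the default result set size; whether threads
help at all on a shared 2 vCPU machine with a regular SSD would need to be measured.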

Beyond these, there are many ways to tune PostgreSQL itself for different workloads,
but that is outside the scope of the Reiz project.

## Setup

Machine:

|              |                        |
| ------------ | ---------------------- |
| provider     | DigitalOcean           |
| service type | droplet (basic plan)   |
| cpu          | (shared) 2vCPU         |
| ram          | 2GB                    |
| disk         | regular SSD (not NVMe) |

IndexDB:

|                 |            |
| --------------- | ---------- |
| total files     | 53k        |
| total AST nodes | 17 521 894 |

The benchmark script is available in the source checkout (`scripts/benchmark_doc.py`).