Scientist-led AI code audits

MergeProof Scientific Audit — Paired Run with Money-Back Promise

We treat every engagement like a clinical study: baseline your repository, help you apply the highest-leverage changes, then rerun the entire MergeProof stack under identical guardrails. If we can’t demonstrate a meaningful improvement in the agreed metrics after a good-faith remediation, we refund the fee.

Led by an NIH- and AHA-funded MD-PhD trainee with 15+ years designing peer-reviewed experiments. MergeProof combines autonomous agents, human analysis, and neural-style pruning of remediation variants so that only the safest, highest-signal paths move forward.

What the Paired Run covers

  • Baseline MergeProof run with full findings + metrics
  • Prioritized remediation plan (HYP/VAR IDs, success criteria)
  • Post-fix MergeProof run and pre/post comparison
  • Detailed experiment log for manuscripts and future audits
  • Money-back promise if metrics fail to improve under the agreed criteria

Pick the path that fits your repo

Both paths feed the same scientific experiment log (HYP/VAR IDs, blinded repo keys, guardrail status). Choose based on your data sensitivity.
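
To make this concrete, here is a sketch of what one entry in that log could look like. The field names and the blinding scheme shown are illustrative, not the registry's actual schema.

    # Hypothetical shape of one entry in the MergeProof experiment log.
    # Field names are illustrative; the real registry schema may differ.
    from dataclasses import dataclass, field
    import hashlib

    @dataclass
    class RunRecord:
        hyp_id: str            # hypothesis under test, e.g. "HYP-042"
        var_id: str            # variant being measured, e.g. "VAR-A"
        blind_repo_key: str    # blinded identifier; the raw URL is never logged
        guardrail_status: str  # "pass", "fail", or "pending"
        metrics: dict = field(default_factory=dict)

    def blind_key(repo_url: str, salt: str) -> str:
        """Derive a blinded repo key so the log never stores the URL itself."""
        return hashlib.sha256((salt + repo_url).encode()).hexdigest()[:12]

    record = RunRecord(
        hyp_id="HYP-042",
        var_id="VAR-A",
        blind_repo_key=blind_key("https://github.com/example/repo", "per-customer-salt"),
        guardrail_status="pending",
        metrics={"test_pass_rate": 0.94, "ci_flake_rate": 0.03},
    )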

🌐 OSS rapid audit (self-serve)

Submit any public GitHub/GitLab/Bitbucket repo. Ideal for open-source projects and internal experiments. No auth required—just paste the URL and we log the run as an experiment.

  • ⏱️ Turnaround: minutes to hours
  • 📊 Output: risk map + experiment record
  • 🧪 Perfect for dataset building & manuscripts

🔐 Private Pilot (GitHub App)

Connect via the AiCan Orchestrator GitHub App or a passkey-protected webhook. We clone privately, run the audit in an isolated workspace, and log only blinded identifiers + metrics.

  • 🔑 Requires customer-specific passkey
  • 🧾 NDA + data handling doc provided
  • 📚 Designed for manuscript-grade evidence on proprietary repos

Need access? Request a Private Pilot passkey at [email protected] and we’ll provision the GitHub App install + intake credentials.

Why MergeProof exists

Scanners drown you in alerts. Auto-fixers can silently break prod. MergeProof adds the missing scientific layer: we frame improvements as hypotheses, test them under guardrails, and keep only what works.

How MergeProof is different

We are not another linter. We are the scientific layer above your existing tools.

👩‍🔬 Scientist-in-the-loop

Every audit is reviewed by a physician-scientist in training with NIH/AHA-funded research. We borrow the discipline of peer-reviewed studies.

🧪 Experiment-first

We log HYP/VAR IDs, expected metrics, and measurement plans. You get a roadmap of what to try next—and how to know if it worked.

🤖 Human-reviewed agents

Autonomous agents explore your repo, CI history, and tests. Humans triage and curate every recommendation before it reaches you.

🛡️ Regression-aware guardrails

We refuse to promote variants that increase test failures, CI flake, or critical issues. Guardrail status is logged for every run so regressions are caught before they merge.
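
In spirit, the promotion rule is as simple as the sketch below; the metric names are illustrative stand-ins for the guardrailed metrics we track.

    # Minimal sketch of the zero-regression promotion rule: a variant is
    # promoted only if nothing we guard gets worse. Metric names are examples.
    def guardrails_pass(baseline: dict, candidate: dict) -> bool:
        """True only if the candidate regresses on no guardrailed metric."""
        return (
            candidate["test_failures"] <= baseline["test_failures"]
            and candidate["ci_flake_rate"] <= baseline["ci_flake_rate"]
            and candidate["critical_issues"] <= baseline["critical_issues"]
        )

    baseline = {"test_failures": 3, "ci_flake_rate": 0.05, "critical_issues": 2}
    candidate = {"test_failures": 1, "ci_flake_rate": 0.05, "critical_issues": 2}
    assert guardrails_pass(baseline, candidate)  # fewer failures, nothing worse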

  • 15+ yrs of scientific research experience
  • 0 guardrail regressions allowed
  • HYP/VAR experiment IDs logged per run
  • AI + human review on every deliverable

What you get

Every engagement ships with tangible artifacts you can act on immediately.

🗺️ Repository risk map

Severity-ranked findings covering security, reliability, tests, CI stability, and observability. Each includes rationale and traceability.

🧭 Experiment-ready plan

Hypotheses (HYP-IDs), variants (VAR-IDs), expected metrics, and success criteria. Think of it as a ready-made backlog for safe improvements.
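
For illustration, one backlog entry might look like the sketch below; the IDs, metrics, and thresholds are hypothetical examples of the HYP/VAR structure, not a fixed schema.

    # Illustrative experiment-plan entry; all values are hypothetical.
    plan_entry = {
        "hyp_id": "HYP-007",
        "hypothesis": "Pinning network mocks in integration tests reduces CI flake",
        "variants": ["VAR-baseline", "VAR-pinned-mocks"],
        "expected_metric": "ci_flake_rate",
        "success_criterion": "ci_flake_rate drops by >= 50% with no new test failures",
        "measurement_plan": "20 repeated CI runs per variant on the same commit",
    }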

🛠️ Patch-level suggestions

Human-reviewed AI diffs, CI/test hardening steps, and configuration fixes you can port into PRs.

MergeProof readiness rating

Clear status: Not ready, Experimental ("shadow") gate, or Ready for staged gate—plus next steps to advance.

How a MergeProof audit runs

A deliberate blend of humans, autonomous agents, and the AiCan Dev Scientific Method.

  1. Intake & scoping

    You submit a repo, branch, and context. For private repos we use short-lived Git tokens over secure channels.

  2. Baseline analysis

    We clone into an isolated environment, inspect structure, tests, CI, and historic incidents to establish baseline metrics.

  3. Agent exploration

    Autonomous agents (Claude + companions) mine the repo, PR history, and CI logs to surface risks and propose candidate fixes. Every hypothesis is logged (HYP/VAR IDs).

  4. Scientist review & pruning

    A human reviewer validates findings, enforces guardrails, and prunes weak variants so only defensible recommendations survive.

  5. Report & follow-up

    You receive the report, experiment plan, and patches. We optionally walk you through the findings and how to integrate them.

AiCan Dev Scientific Method

Rapid experimentation, zero-regression guardrails

Every run is logged as an experiment with blind IDs. Guardrail status, metrics, and pruner decisions are appended to both private and scientific registries so we can publish or audit every claim.

We generate hypotheses, test them at AI speed, keep what improves metrics, and prune the rest. Think of it like training a neural network—except every weight update is traceable, human-reviewed, and backed by evidence.
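
A minimal sketch of that keep-or-prune loop, assuming helper functions measure(...) (scores a variant, higher is better) and guardrails_pass(...) (checks the zero-regression rule); neither is MergeProof's actual internals.

    # Keep-or-prune loop over remediation variants: keep what improves the
    # score AND passes guardrails, prune the rest, log every decision.
    def prune_variants(baseline_score: float, variants: dict,
                       measure, guardrails_pass) -> dict:
        kept = {}
        for var_id, patch in variants.items():
            score = measure(patch)
            ok = score > baseline_score and guardrails_pass(patch)
            # Every decision is printed/logged so the pruning stays auditable.
            print(f"{var_id}: score={score:.3f} -> {'keep' if ok else 'prune'}")
            if ok:
                kept[var_id] = patch
        return kept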

MergeProof Scientific Audit

One product, paired by design. $1,500 with a measurable-outcome money-back promise.

$1,500 Paired Run Audit

For repos up to ~50k LOC (or equivalent complexity). Includes one baseline, one post-fix run, and a written analysis from our MD-PhD-led team.

  • Baseline MergeProof run + prioritized remediation plan
  • Post-fix MergeProof run with identical guardrails
  • Experiment log (HYP/VAR IDs, metrics, guardrail status) for manuscripts
  • Optional 30–45 min review session or recorded walkthrough

Prerequisite: a working automated test suite (pytest, nose, etc.) so the AiCan Dev Scientific Method can measure before/after results. We’ll help you scope this during intake.
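
As an example of what "measure before/after" can mean in practice, the sketch below runs a pytest suite with JUnit XML output and extracts pass/fail counts. The file paths and metric names are illustrative.

    # Sketch: capture suite metrics before and after remediation via
    # pytest's JUnit XML report. Assumes pytest is installed and the
    # suite lives in the current directory.
    import subprocess
    import xml.etree.ElementTree as ET

    def suite_metrics(report_path: str) -> dict:
        # A non-zero exit just means failing tests, so don't raise on it.
        subprocess.run(["pytest", "-q", f"--junitxml={report_path}"], check=False)
        root = ET.parse(report_path).getroot()
        suite = root if root.tag == "testsuite" else root.find("testsuite")
        total = int(suite.get("tests", 0))
        bad = int(suite.get("failures", 0)) + int(suite.get("errors", 0))
        return {"tests": total, "failing": bad,
                "pass_rate": (total - bad) / total if total else 0.0}

    before = suite_metrics("baseline.xml")  # run on the baseline commit
    # ... apply the prioritized remediation, then rerun identically ...
    after = suite_metrics("postfix.xml")
    print(before, after)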

Money-back promise

If you apply a good-faith subset of the prioritized recommendations and the follow-up run fails to show a meaningful improvement in the agreed metrics (with no new regressions), we refund the fee.
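
Illustratively, the promise reduces to a check like the one below. The 10% threshold is only an example; the actual metrics and bar are agreed during intake.

    # Sketch of the money-back criterion: the agreed metric must improve
    # meaningfully AND nothing guarded may regress. Threshold is an example.
    def promise_met(baseline: dict, postfix: dict,
                    agreed_metric: str, min_rel_gain: float = 0.10) -> bool:
        gain = postfix[agreed_metric] - baseline[agreed_metric]
        improved = gain >= min_rel_gain * max(baseline[agreed_metric], 1e-9)
        no_new_regressions = postfix["test_failures"] <= baseline["test_failures"]
        return improved and no_new_regressions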

“Why $1,500?”

Comparable professional code audits typically range from $1k–$5k for small projects and $5k–$15k for mid-sized codebases. MergeProof includes:

  • Scientist-in-the-loop analysis led by an NIH/AHA-funded MD-PhD trainee
  • Paired, publishable experiment design (baseline + remediation + post-fix)
  • Neural-style pruning of remediation variants to avoid regressions
  • Outcome-linked refund if our process fails to deliver measurable improvement

Built by a scientist who lives with your constraints

MergeProof is led by an NIH- and AHA-funded MD-PhD trainee who has spent over a decade designing clinical and computational studies, applying statistics under uncertainty, and translating messy data into decisions. We bring that same rigor to your codebase.

What MergeProof does not promise

  • We do not guarantee bug-free code or “instant fixes.”
  • We do not auto-merge or block PRs without your approval.
  • We do not replace your unit tests, SAST/DAST, or QA.
  • We do provide a sharper map of risk and a disciplined path to improvements.

FAQ

Is this a beta?

No. MergeProof v1 is a paid, human-in-the-loop audit. Autonomous agents assist internally, but every deliverable is human-reviewed.

Will you see our code?

Yes. We clone into isolated environments, keep logs, and do not use your proprietary code to train public models.

Can you integrate directly into our CI?

For v1 we focus on audits and recommendations. We’ll advise how to wire findings into your CI/CD and work with your existing tools.

Do you support private repos?

Yes. We use standard Git provider access tokens, log access for auditing, and honor your security controls.

Experiment dashboard (beta)

Live metrics pulled from the MergeProof scientific registry. Every run is logged with blind IDs, guardrail status, and pruning decisions.

  • Active hypotheses (this week): –
  • Total logged runs: –
  • Active variants: –
  • Pruning status: –
  • Guardrail outcomes: – pass / – fail / – pending
  • Improvement plans logged: –
  • Improvement patches logged: –

Submit a repository for audit

Use this form for OSS rapid audits (public URL) or Private Pilot submissions (GitHub App + passkey). Every run feeds the AiCan Dev Scientific Method log with blinded IDs and guardrail status.

🔬 Scientific Method Enabled
Every submission is logged as an experiment (HYP/VAR IDs, guardrail status). Underperforming agent variants are pruned, so you benefit from the best-performing strategies.
Double-blind logging · Zero regression policy · Rapid experimentation
The form collects:

  • Repository URL (GitHub, GitLab, or Bitbucket HTTPS URL)
  • Branch to analyze (defaults to the repo’s default branch)
  • Contact email (we use this to send status updates)
  • Hypothesis ID (format: HYP-XXX; optional but recommended)
  • Variant name (e.g., baseline, variant-A)
  • Tags to group runs (optional)
  • Passkey (required for Private Pilot repos; leave blank for public OSS audits)
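
If you prefer to script a submission, it amounts to an HTTP POST of those fields. The endpoint URL and field names below are placeholders for illustration, not a published API.

    # Hypothetical scripted submission; the endpoint and field names are
    # placeholders, not a documented MergeProof API.
    import requests

    payload = {
        "repo_url": "https://github.com/example/repo",  # public HTTPS URL
        "branch": "main",
        "email": "you@example.com",                     # for status updates
        "hyp_id": "HYP-001",                            # optional but recommended
        "variant": "baseline",
        "tags": ["oss", "pilot-eval"],
        "passkey": "",                                  # Private Pilot only
    }
    resp = requests.post("https://mergeproof.example/api/submit",
                         json=payload, timeout=30)
    resp.raise_for_status()
    print(resp.json())  # expected to include the Run, Job, and Blind IDs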

Processing your repository can take a few minutes. A successful submission returns a Run ID, Job ID, and Blind ID, and the job is queued for processing. You’ll receive updates once guardrails, metrics, and report artifacts are ready.