Performance campaign quality scorecard

A practical scorecard for judging whether a CPA campaign is scaling on quality, not just volume, before you add budget.

KiwiWall · Jul 03, 2026 · 12 min read

Performance campaign quality scorecard in brief

Run this scorecard against your campaign each week, before you add budget. A campaign is scaling on quality when postback integrity, conversion quality, and dispute control hold steady or improve as volume grows. If any of the three is slipping, reduce spend instead of increasing it.

Use this workflow:

Confirm postback and attribution quality are stable.
Evaluate conversion quality by action quality, not just raw conversion count.
Measure cap discipline and quality drift by source.
Score each source on a unified 0–100 rubric.
Pause, tune, or scale source-level budget only if quality is improving or stable under the same controls.

Who this is for

Advertisers running CPA/CPL/CPI campaigns and planning weekly spend decisions.
Performance operators coordinating traffic, quality, and finance inputs.
Technical operators maintaining postback reliability, attribution consistency, and reconciliation.
Affiliate managers comparing multiple publishers or traffic partners.

Definition

A performance campaign quality scorecard is a structured weekly review framework used to decide whether to scale, hold, or cut traffic after observing both outcomes and system behavior.

In KiwiWall terms, it should combine three input domains:

Monetization outcome: approved conversion and payout quality.
Traffic behavior: postback latency, duplicate handling, rejection pattern.
Operational control: what happens when quality drops (ownership, rollback ability, and alerting).

Think of it as a campaign health passport where every source gets a score, not a gut opinion.

Decision table

Situation	When to use this scorecard	When to use a simpler test-only approach	KPI to check
You run multiple publishers/geos	✅ Scale and govern before adding budget	⚠️ If 1 publisher × 1 geo	postback success %, reject reason mix
Spend already paused/low	✅ Rebuild baseline before ramping	⚠️ If you need only a one-off smoke test	latency p95, dedupe mismatch rate
Compliance risk matters (fraud, chargebacks)	✅ Always	❌ Avoid blind scaling	invalid-traffic events, dispute count
Offer mix changes daily	✅ Must standardize by source and campaign IDs	⚠️ Use short experiment checklists only	source quality trend by week
You compare offerwall + direct affiliate offers	✅ Required for apples-to-apples decisioning	⚠️ If campaigns are not related	conversion quality by source and objective
You have strict caps and finance guardrails	✅ Supports cap governance	⚠️ If caps are not yet configured	cap utilization and pacing accuracy

Use this page when your problem is “my campaign is up but the quality is inconsistent.” If your question is only “which creative gets higher CR,” use a creative experiment runbook first.

How it works

Step 1: Define campaign surface and scoring window

You need stable definitions first. For the next 7 days, define:

campaign objective (signup, deposit, trial, purchase)
primary source dimension (publisher, offer_id, geo, placement)
evaluation window (recommend: rolling 7 days, reviewed weekly)
quality controls that are active (fraud checks, postback dedupe, cap by source)

Keep the window fixed until scorecards are stable for two consecutive weeks.

Step 2: Lock event definitions

Use the exact event fields in every postback:

source and campaign IDs
geo and device segment
user/session identifier strategy
transaction reference
status state (accepted, delayed, rejected, duplicate, disputed)

If IDs are inconsistent, unmatched postbacks become the largest “unknown” in the data, and the scorecard is reduced to guessing.
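A minimal sketch of a field check for incoming postbacks, assuming each postback arrives as a dictionary. The field names (`source_id`, `campaign_id`, `user_ref`, and so on) are hypothetical placeholders for illustration, not KiwiWall's actual parameter names; only the five status values come from the list above.

```python
# Status values taken from the status list in Step 2.
VALID_STATUSES = {"accepted", "delayed", "rejected", "duplicate", "disputed"}

# Hypothetical field names; map them to whatever your postbacks actually send.
REQUIRED_FIELDS = (
    "source_id",
    "campaign_id",
    "geo",
    "device",
    "user_ref",
    "transaction_ref",
    "status",
)


def validate_postback(event: dict) -> list[str]:
    """Return a list of problems; an empty list means the event is usable."""
    problems = []
    for name in REQUIRED_FIELDS:
        value = event.get(name)
        if value is None or str(value).strip() == "":
            problems.append(f"missing {name}")
    status = event.get("status")
    # Only report an unknown status when one was actually supplied.
    if status is not None and str(status).strip() != "" and status not in VALID_STATUSES:
        problems.append(f"unknown status: {status}")
    return problems
```

Events that fail this check can be counted separately, so that definition gaps show up as their own number instead of silently lowering the match rate.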

Step 3: Collect the required raw inputs

Each source/geo pairing needs these raw values:

Postback match rate
Approval-to-payout conversion rate
Postback retry count and latency (median and p95)
Duplicate conversion rate
Invalid traffic / rejection classification
Payout exposure vs estimated fraud-adjusted return
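As a rough sketch, a few of these raw values can be derived from one window of postback events for a single source/geo pairing. The event keys (`matched`, `status`, `latency_s`) are assumptions made for this example, and the p95 uses the nearest-rank method, which is one of several common percentile definitions.

```python
import math
from statistics import median


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile; pct is a whole number from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_window(events: list[dict]) -> dict:
    """Summarize one source/geo pairing over the evaluation window.

    Each event is assumed to carry (hypothetical keys):
      matched   - True if the postback was matched to a tracked click
      status    - one of the status values from Step 2
      latency_s - seconds between the conversion and the postback
    """
    total = len(events)
    if total == 0:
        raise ValueError("no events in window")
    matched = sum(1 for e in events if e["matched"])
    duplicates = sum(1 for e in events if e["status"] == "duplicate")
    latencies = [e["latency_s"] for e in events]
    return {
        "match_rate": matched / total,
        "duplicate_rate": duplicates / total,
        "latency_median_s": median(latencies),
        "latency_p95_s": percentile(latencies, 95),
    }
```

Reporting the median and p95 side by side matters because a healthy median can hide a slow tail of delayed callbacks.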

Step 4: Score each source on a 0–100 rubric

Use this baseline:

30 points — Postback and attribution integrity
(match rate, dedupe, latency variance, missing callback recovery)
25 points — Approval and conversion quality
(approval ratio, return rate, objective quality by funnel stage)
25 points — Offer and traffic consistency
(rejection spread, volatility by day, cap health, pacing reliability)
20 points — Operational readiness
(ownership readiness, escalation path, rollback speed)

The score is a weighted sum of these four components, and the weights should be published before the week starts. Keep scoring consistent; changing weights mid-cycle makes week-over-week scores incomparable and creates false trend confidence.
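One way to express the rubric, assuming each component is first rated on a 0.0 to 1.0 scale and then multiplied by its point weight. The weights are the ones listed above; the component keys and the 0.0 to 1.0 rating scale are assumptions for this sketch, and how you arrive at each rating is left to your own KPI definitions.

```python
# Point weights from the baseline rubric; they sum to 100.
WEIGHTS = {
    "postback_integrity": 30,
    "conversion_quality": 25,
    "traffic_consistency": 25,
    "operational_readiness": 20,
}


def score_source(ratings: dict[str, float]) -> float:
    """Combine per-component ratings (0.0 to 1.0) into a 0-100 score."""
    if set(ratings) != set(WEIGHTS):
        raise ValueError("ratings must cover exactly the four rubric components")
    for name, rating in ratings.items():
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"{name} rating must be between 0.0 and 1.0")
    return round(sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS), 1)
```

Keeping the weights in a single constant makes it harder to change them mid-cycle by accident.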

Step 5: Convert scores into action

Start with these decision bands:

80+: Eligible for controlled scale (up to planned step-up amount).
60–79: Hold spend, tune one parameter at a time (e.g., geo mix or offer cap).
Below 60: Pause incrementally and root-cause before adding budget.

Never change multiple variables at once. If you change traffic source mix, postback policy, and payout together, you cannot tell what worked.
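The decision bands translate directly into a small lookup. This sketch assumes fractional scores are possible and treats anything from 60 up to, but not including, 80 as the hold band; the function name and return labels are illustrative.

```python
def decision_band(score: float) -> str:
    """Map a 0-100 score to one of the three starting decision bands."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "scale"  # eligible for controlled scale
    if score >= 60:
        return "hold"   # hold spend, tune one parameter at a time
    return "pause"      # pause incrementally and root-cause
```

With the scores used in the Example section (78, 55, and 82), this returns `hold`, `pause`, and `scale` respectively.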

Step 6: Attach a source-level playbook

For each source, record:

immediate action
owner
timeline
what “good” looks like after 48 hours

This is where teams avoid ambiguity. A scorecard without ownership becomes a static report.

Step 7: Track changes in a lightweight experiment log

Every change should have one line: date, source, change made, expected effect, decision band movement, and outcome check.
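A lightweight way to keep that one-line discipline is a fixed record type written out as CSV. The class and field names below are hypothetical; the six fields mirror the ones listed above.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class LogEntry:
    day: date
    source: str
    change: str
    expected_effect: str
    band_movement: str
    outcome_check: str


def to_csv_line(entry: LogEntry) -> str:
    """Render one log entry as a single CSV line (no trailing newline)."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([
        entry.day.isoformat(),
        entry.source,
        entry.change,
        entry.expected_effect,
        entry.band_movement,
        entry.outcome_check,
    ])
    # The csv module appends a line terminator; strip it for one-line output.
    return buffer.getvalue().rstrip("\r\n")
```

Using the `csv` module instead of joining strings by hand means a change description that contains a comma is quoted correctly and still occupies one field.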

Example

An advertiser has three publisher sources and wants to scale a deposit objective campaign.

Source A: postback match rate 97%, stable rejections, and strong conversion quality, but the duplicate rate spikes after one UTC window.
Source B: high raw conversions but approval drops whenever traffic exceeds cap.
Source C: low volume, stable quality, excellent latency profile.

The scorecard calculates:

Source A: 78 (good base, needs dedupe fix)
Source B: 55 (volatile approval behavior, unstable quality)
Source C: 82 (lower volume but healthy controls)

Decision:

scale Source C by 10% (controlled)
hold Source A and prioritize the dedupe fix
pause Source B until approval stability is restored

Common mistakes

Using only conversion rate. Raw conversion rate hides payout-quality problems.
Ignoring postback failures. Healthy conversion with a broken callback path creates fake confidence.
Changing too many variables. A valid trend requires one major lever at a time.
Overweighting one week of history. Use rolling windows and compare against your own previous bands.
No hard rollback conditions. “Let it run longer” is not a strategy.

Checklist (minimum readiness)

Source list, offer set, and geo scope are defined for the scorecard window.
KPI definitions are documented and identical across teams.
Source-level postback integrity dashboard exists for all active sources.
Scoring weights are published and unchanged for the review period.
Duplicate and reject classifications are separated from true conversion failures.
Every score has owner, action, and review date.
Cap utilization is monitored against planned and max thresholds.
Latency and retry behavior are reviewed with p95, not only median.
Any source below 60 is placed in a recovery plan, not manually “ignored.”
Weekly close includes a decision note, not just a raw score update.

FAQ

Why does a source with strong raw volume still get a low score?

Because the scorecard weights approval health and callback integrity above raw click or conversion counts.

Can this framework work without full attribution parity across all sources?

It works, but with lower confidence. Set that as a known risk and downweight scaling decisions where parity is missing.

Is this only for CPA campaigns?

No. It works for CPL, CPI, and other outcome-based models, but you must adjust objective-specific quality definitions.

Should I include brand lift, creative effect, or landing-page variables?

Only if those variables move slowly and independently of source quality. Otherwise they confound the scorecard and distort scaling decisions.

How do I handle one-off spikes or drops?

Tag those days as outliers and require two-week confirmation before making directional budget changes.
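One possible reading of "two-week confirmation" is that the two most recent week-over-week moves must point the same way before a direction counts. The sketch below implements that reading only; it is an assumption, and it expects weekly scores that were already computed with the tagged outlier days excluded.

```python
def confirmed_direction(weekly_scores: list[float]) -> str:
    """Report a trend only when the last two weekly moves agree.

    weekly_scores is ordered oldest to newest and needs at least
    three entries to produce two week-over-week moves.
    """
    if len(weekly_scores) < 3:
        return "unconfirmed"
    previous_move = weekly_scores[-2] - weekly_scores[-3]
    latest_move = weekly_scores[-1] - weekly_scores[-2]
    if previous_move > 0 and latest_move > 0:
        return "up"
    if previous_move < 0 and latest_move < 0:
        return "down"
    return "unconfirmed"
```

A single spike followed by a return to baseline yields `unconfirmed`, which keeps a one-off day from driving a budget change.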

When this applies

Use this page when you have:

at least two active traffic sources or one source with high internal variance,
KiwiWall postbacks with status granularity,
a finance expectation that requires quality and payout predictability,
and a team willing to tie spend decisions to operating controls.

When this does not apply

If you only run one tiny traffic source, one geo, and no meaningful postback visibility, start with a smaller experiment sheet first.

Conversion link

If your scorecard shows repeated quality issues before any spend increase, run a 7-day quality reset before expanding. If you’re ready to execute, initiate an advertiser campaign review with KiwiWall: Start with KiwiWall.

For broader campaign architecture, return to the parent hub: Advertiser Performance Scaling.
