The 5 KPIs Your APS Should Improve in 90 Days
If you are evaluating an Advanced Planning and Scheduling (APS) system, you have probably already read about OEE, OTIF, and lead time. Those are the metrics that show up in every vendor deck, every analyst report, every LinkedIn carousel about manufacturing excellence.
They are also the wrong place to start.
Not because they are unimportant. They matter a great deal to the business. But they are lagging, aggregated, and slow to move. OEE blends availability, performance, and quality into one number that can stay flat for a quarter even while your scheduling has gotten dramatically better or worse underneath it. OTIF depends on sales commitments, supplier reliability, and a dozen factors an APS does not control. If you set up your 90-day evaluation around these corporate KPIs, you will not see movement in time to draw a conclusion, and you risk writing off a tool that is actually working.
There is a more honest way to evaluate an APS: look at the metrics it directly optimizes. These are the objective functions that scheduling engines are built to minimize or maximize. When the engine is solving a real problem with real constraints, these numbers move fast, often within the first scheduling cycles. They are also the metrics that, over time, become the engine room behind the lagging indicators everyone reports to the board.
Here are the five that matter most, what realistic 90-day movement looks like, and why each one is a leading signal for the business metrics you ultimately care about.
1. Tardiness
Tardiness sums the delay across every job in the schedule. Every job is part of the calculation, but a job that ships on time or early contributes a value of zero, never a negative number. There is no credit for finishing early. A job that ships five days late contributes exactly five days to the total, regardless of how early everything else ran.
This means the metric only ever accumulates from the late side. Ten jobs finishing a week early do not offset one job finishing a day late. The total tardiness for the schedule is simply the sum of all these individual contributions, most of which are zero, with the late jobs carrying the entire weight of the number.
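The calculation described above can be sketched in a few lines. This is a minimal illustration, not the formula of any particular APS product: the function name `total_tardiness`, the (due date, finish date) pairs, and the sample dates are all assumptions made up for the example.

```python
from datetime import date


def total_tardiness(jobs):
    """Sum the days late across all jobs.

    `jobs` is a list of (due_date, finish_date) pairs (hypothetical layout).
    A job that finishes on time or early contributes zero, never a negative value.
    """
    total = 0
    for due, finished in jobs:
        days_late = (finished - due).days
        total += max(0, days_late)  # no credit for finishing early
    return total


# Hypothetical sample: one early job, one on-time job, one job five days late.
jobs = [
    (date(2024, 3, 10), date(2024, 3, 3)),   # a week early, contributes 0
    (date(2024, 3, 10), date(2024, 3, 10)),  # on time, contributes 0
    (date(2024, 3, 10), date(2024, 3, 15)),  # five days late, contributes 5
]

print(total_tardiness(jobs))  # 5
```

The `max(0, ...)` clamp is the whole point: the early job's seven days of slack never reach the total, so it cannot hide the late job.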
This distinction matters because it is the inverse of how most planners instinctively manage a schedule. Under pressure, it is natural to chase whichever job is loudest right now, which often means an already-early job gets prioritized further while a quietly slipping job falls further behind. A scheduling engine that explicitly minimizes total tardiness will not make that mistake. It will sequence work so that the jobs at risk of lateness get protected, even if that means a buffer job runs a little later than it strictly needs to.
What to expect in 90 days: a measurable drop in the number of jobs crossing their due date, and a smaller average delay on the ones that still do. This is usually one of the fastest-moving numbers because it responds directly to sequencing logic, not to upstream demand volatility.
Why it matters beyond scheduling: tardiness is the direct mechanical ancestor of OTIF. You cannot improve on-time-in-full at the customer level without first reducing tardiness at the job level. Watching tardiness in week one gives you an early read on whether OTIF will follow in month three.
2. Makespan
Makespan is the elapsed time between the start of the first task and the end of the last task in a schedule, batch, or campaign. It answers a very specific question: given this set of jobs, how long does it take to get through all of them, start to finish?
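As a rough sketch of that definition, makespan is the latest end time minus the earliest start time across the tasks in the batch. The `(start, end)` tuple layout and the sample hours below are illustrative assumptions, not data from a real schedule.

```python
def makespan(tasks):
    """Elapsed time from the earliest task start to the latest task end.

    `tasks` is a list of (start, end) pairs in any consistent time unit
    (hypothetical layout; hours since the start of the run in this example).
    """
    if not tasks:
        raise ValueError("makespan needs at least one task")
    first_start = min(start for start, _ in tasks)
    last_end = max(end for _, end in tasks)
    return last_end - first_start


# Hypothetical batch of three tasks, times in hours.
tasks = [(0.0, 4.5), (1.0, 6.0), (6.0, 11.5)]

print(makespan(tasks))  # 11.5
```

Note that idle gaps and overlaps between tasks do not appear in the formula. Only the two outer endpoints matter, which is why resequencing the work in between can shorten the number.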
Makespan is the metric to watch when you are evaluating a defined batch of work rather than a continuous flow, for example a weekly production run, a campaign in a process plant, or a finite set of orders that all need to clear before a changeover to a different product family. A shorter makespan for the same set of jobs means the same work got done in less elapsed time, which frees up the back end of the schedule for more work, maintenance, or a buffer against the unexpected.
What to expect in 90 days: a shorter total duration for comparable batches or campaigns, typically most visible in the first one or two scheduling cycles after go-live, since this metric tends to respond quickly once setup times and sequencing improve.
Why it matters beyond scheduling: a tighter makespan directly shortens the lead time customers experience for that batch, and it is one of the more intuitive numbers to explain to people outside the scheduling function. "The same job mix now finishes six hours earlier" is a sentence everyone in the building understands.
3. Setup Time
Setup time sums the duration of sequence-dependent setups, cleanups, and changeovers across the schedule. The word "sequence-dependent" is the key detail. The cost of a changeover is not fixed. It depends entirely on what ran before it. Switching from a light color to a dark color on a paint line costs almost nothing. Switching from dark to light might mean a full cleaning cycle.
Most ERP and basic planning tools treat changeover time as a flat average per job. An APS engine that minimizes sequence-dependent setup time looks at the entire sequence and asks: in what order should these jobs run so that the total changeover burden is as low as possible? Two schedules with the exact same set of jobs can differ by hours of non-productive time purely based on sequencing.
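To make the sequencing effect concrete, here is a small sketch that totals changeover time for two orderings of the same jobs. The changeover matrix, the color names, and the minute values are invented for illustration. The exhaustive search over permutations is only workable for a handful of jobs, and it is not a claim about how any real APS engine searches for a sequence.

```python
from itertools import permutations

# Hypothetical changeover minutes, read as SETUP[previous_color][next_color].
# Light-to-dark is cheap; dark-to-light needs a cleaning cycle.
SETUP = {
    "white": {"white": 0, "grey": 5, "black": 5},
    "grey": {"white": 45, "grey": 0, "black": 5},
    "black": {"white": 90, "grey": 45, "black": 0},
}


def total_setup(sequence):
    """Sum the sequence-dependent changeover time between consecutive jobs."""
    return sum(SETUP[prev][nxt] for prev, nxt in zip(sequence, sequence[1:]))


# The same five jobs, first in an arbitrary order, then in the cheapest order.
jobs = ["black", "white", "grey", "white", "black"]
as_planned = total_setup(jobs)
best = min(permutations(jobs), key=total_setup)  # brute force, tiny inputs only

print(as_planned)         # 145
print(total_setup(best))  # 10
```

With these made-up numbers, the identical job set costs 145 minutes of changeover in one order and 10 in another, which is the gap a flat per-job average cannot see.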
What to expect in 90 days: a visible reduction in total changeover hours per week or per shift, often one of the largest and most immediately credible wins because the math is easy to verify on the shop floor. Plant managers can usually feel this one before the report confirms it.
Why it matters beyond scheduling: every hour not spent on a changeover is an hour available for production. This is the most direct lever on capacity you have without buying equipment, and it feeds straight into the next two metrics.
4. Throughput
Throughput is the total quantity of all products produced, and the engine's objective is to maximize it. It is a simple, almost brutal metric: more output, full stop, regardless of mix.
This is the metric to watch when the constraint is capacity, not demand. If you have more orders than you can fill, an engine maximizing throughput will sequence and allocate resources to squeeze the maximum total volume out of the plant. That is generally what the business wants in that situation, but confirm with your planning team that raw volume, rather than a specific product mix or margin profile, is the actual goal for the period being measured.
What to expect in 90 days: an increase in total units produced per week without additional labor or equipment, driven mostly by recovered time from reduced setups and fewer scheduling conflicts.
Why it matters beyond scheduling: throughput is the rawest measure of capacity utilization. If you are quoting lead times or capacity commitments to customers, this is the number underneath those promises.
5. Throughput Rate
Throughput rate is the quantity of all products produced per unit of time, and the engine's objective is to maximize it. The distinction from plain throughput matters more than it looks. A plant can increase total throughput simply by running more hours. Throughput rate strips out that variable and asks how much you are producing per hour of available time, which makes it a much cleaner signal of scheduling efficiency than a number that also rises with longer shifts.
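The difference between the two metrics is easy to see with a small calculation. The sketch below assumes a hypothetical data layout (units per product plus available hours per week), and the product names and figures are invented for the example.

```python
def throughput(units_by_product):
    """Total units produced, regardless of product mix."""
    return sum(units_by_product.values())


def throughput_rate(units_by_product, available_hours):
    """Units produced per hour of available time."""
    if available_hours <= 0:
        raise ValueError("available_hours must be positive")
    return throughput(units_by_product) / available_hours


# Hypothetical weeks: week B adds overtime hours compared with week A.
week_a = ({"bracket": 600, "panel": 400}, 80)  # 1000 units in 80 hours
week_b = ({"bracket": 660, "panel": 440}, 96)  # 1100 units in 96 hours

rate_a = throughput_rate(*week_a)
rate_b = throughput_rate(*week_b)

print(throughput(week_a[0]), rate_a)            # 1000 12.5
print(throughput(week_b[0]), round(rate_b, 2))  # 1100 11.46
```

In this made-up case throughput rises by 10 percent while throughput rate falls, so the extra volume came from longer hours rather than from a better schedule.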
What to expect in 90 days: a steadier, often smaller improvement than the raw throughput number, but a more trustworthy one. If throughput rate is climbing alongside throughput, you know the gain is coming from better sequencing and resource use rather than simply running the plant longer.
Why it matters beyond scheduling: this is close to a proxy for OEE's performance component that you can read without waiting for a full OEE calculation cycle. It tells you, in near real time, whether the schedule itself is getting more efficient.
How These Five Connect to the KPIs Your Board Actually Asks About
None of these five metrics are vanity numbers. They are upstream causes of the downstream KPIs everyone already tracks.
This is the honest way to frame a 90-day pilot. Do not promise that a board-level KPI will visibly move in three months, because most of them are lagging and noisy by design. Instead, agree upfront on which of these five objective functions are most relevant to your operation, baseline them before go-live, and track them weekly. If they move in the right direction, the lagging KPIs will follow on their own timeline, and you will have the evidence to show exactly why.
Setting a Realistic 90-Day Benchmark
A pilot evaluation works best with a simple structure:
- Weeks 1 to 2: baseline the current schedule's tardiness, setup time, throughput, throughput rate, and makespan using historical data, before any optimization is applied.
- Weeks 3 to 8: run the APS engine in parallel with current planning, comparing the engine's proposed schedules against what actually happened on the floor.
- Weeks 9 to 12: move to live execution on at least one production line or product family, and track the same five metrics week over week.
The goal is not a single before-and-after snapshot. Scheduling problems are dynamic: demand shifts, machines go down, rush orders arrive, and a single comparison can be misleading in either direction. A 12-week trend line on these five metrics gives a far more reliable picture than any single week, and it gives you a defensible answer when someone asks what the new system actually changed.
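One simple way to turn weekly readings into a trend line is an ordinary least-squares slope over the week index. This is a sketch under stated assumptions, not a prescribed method: the function name `trend_slope` and the weekly changeover-hour figures are hypothetical, and a plain linear fit is only one of several reasonable choices.

```python
def trend_slope(weekly_values):
    """Least-squares slope of a metric against week index (0, 1, 2, ...).

    A negative slope means the metric is falling week over week,
    a positive slope means it is rising.
    """
    n = len(weekly_values)
    if n < 2:
        raise ValueError("need at least two weekly readings")
    mean_x = (n - 1) / 2
    mean_y = sum(weekly_values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(weekly_values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical weekly changeover hours over a 12-week pilot, with noise.
setup_hours = [42, 44, 39, 40, 37, 38, 34, 36, 31, 33, 30, 29]

slope = trend_slope(setup_hours)
print(f"change per week: {slope:.2f} hours")
```

Individual weeks in the sample go up as well as down, yet the fitted slope is negative, which is the kind of signal a single before-and-after comparison could miss or overstate.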
FAQ
1. Why shouldn't I just track OEE and OTIF during my APS pilot?
You can track them, but do not expect them to move much in 90 days. OEE and OTIF are lagging indicators that blend multiple factors, some of which an APS does not control, such as supplier delays or sales commitments. They are also slow to react because they are usually calculated as rolling averages over weeks or months. Tardiness, setup time, throughput, throughput rate, and makespan respond directly to scheduling changes, often within the first few cycles, which makes them far more useful for judging a short pilot.
2. Which of the five KPIs should I prioritize if I can only watch one or two?
It depends on your constraint. If late deliveries are your main pain point, watch tardiness first, since it is the direct precursor to OTIF. If your plant runs many product changeovers, setup time will usually show the fastest and most visible improvement. If your bottleneck is overall capacity rather than lateness, throughput and throughput rate are the better signals.
3. How quickly should I expect to see results after go-live?
Setup time and tardiness tend to move fastest, often visible within the first one or two scheduling cycles, because they respond directly to sequencing logic. Throughput and throughput rate usually take a few more cycles to stabilize, since they depend on sustained scheduling discipline rather than a single optimized sequence. Makespan improvements are typically visible as soon as a comparable batch or campaign runs through the new schedule.
4. Do these five KPIs replace OEE, OTIF, and lead time, or work alongside them?
They work alongside them, not instead of them. OEE, OTIF, and lead time remain the metrics you report to customers and leadership. The five APS-level KPIs are the leading indicators that explain why those numbers are moving, and they give you an early read during a pilot, long before the lagging KPIs have had time to catch up.
5. What does a realistic improvement look like across these five metrics in 90 days?
It varies by plant and by how far the current schedule is from optimal, so treat any fixed percentage with caution. What is consistent is the order in which improvements appear: setup time and tardiness typically improve first, throughput and throughput rate follow as the schedule stabilizes, and makespan improvements become visible as soon as a comparable batch runs through the new process. A 12-week trend line, rather than a single before-and-after snapshot, is the most reliable way to confirm genuine improvement.