Jan 29, 2026 | 4 min read

Why your Architecture should start with Questions, not boxes

"How do I know I am growing in my career?"

An engineer asked me this in a 1:1 recently. I have heard it many times, usually from people who feel they are plateauing at a “Senior” level.

My answer is simple. I use a small test for myself. I look at the design or code I wrote six months ago. If I can see ways to improve it, I know I am growing. If I look at it and cannot find a single thing to make better, I know I have stagnated. I need to learn more.

You can replace “code” with anything: design, architecture, writing, a blog post, or even a business strategy. If you cannot look back and see the flaws in your past work, you have not moved forward. To grow, you have to work on your craft. Craft requires a feedback loop.

Good architecture is no different. If you review an architecture document from six months ago and you cannot see missing questions about failure modes, limits, or tradeoffs, you have probably stopped growing as an architect.

This is why your architecture should start with questions, not boxes.

The Curiosity vs Performance Trap

In my experience, what separates the people I enjoy working with from the people I do not is whether they ask questions out of curiosity.

Engineers and designers who ask questions from a curious lens instead of a performance lens are some of the best people to work with. Curious questions sound like:

  • Can you help me understand why we chose this approach?

  • What problem were we originally trying to solve here?

  • What happens if we do not do this at all?

Performance questions, on the other hand, sound like:

  • Why did nobody think of this obvious solution?

  • Should we not be doing X instead?

  • Who decided this was a good idea?

You are asking about the same technical thing, but with a completely different intent.

Curious people make teams smarter because they make assumptions visible. They allow the team to see things in a different way and make it safe to say, “I do not know yet.”

Performance-driven questions make teams defensive. Even when the person asking is right, these questions train the team to stay quiet the next time to avoid being put on the spot. They turn technical discussions into a blame game.

If you want to build a culture of “Peace of Mind Engineering”, you have to protect the curious and keep the performance theater out of the room.

You also have to channel that curiosity into the right questions, before anyone starts drawing boxes.

Questions to ask before drawing boxes

Before you open a diagramming tool, ask questions like:

  • What user or business pain are we actually trying to reduce?

  • What happens if we do not build this at all?

  • What is the simplest version that delivers most of the value?

  • What can fail, and how will we know before the customer does?

  • Which constraints are real (SLOs, budgets, team skills) and which are self-imposed?

  • Who will own this in twelve to eighteen months, and what are the painful maintenance tasks?

The quality of your architecture is limited by the quality of questions you ask at this stage. The boxes only come after.

You Cannot Learn to Swim Without Getting into the Water

Many engineers think they can just read system design blogs and newsletters to get better at architecture. They practice system design interviews on whiteboards and assume that is enough experience to scale systems for a large business.

It is not.

You cannot learn how to swim by watching videos. At some point, you have to get into the water.

Blogs and newsletters are useful for high-level awareness. But if you spend years consuming them without owning systems in production, you will stagnate as an architect.

There is no replacement for the experience of building and scaling systems in production. When I interview candidates, many have a decade of experience on paper. Very few have stayed with the same system long enough to see the feedback on their technical decisions.

If you leave a project three months after launch, you never see the sharp edges.

You never see what happens to your elegant microservices when a downstream dependency starts timing out at the ninety-ninth percentile. On a whiteboard, you just add retry logic. In production, that retry logic can trigger a retry storm that takes down your entire infrastructure.

Here is a simple example.

We once had a service that depended on a payments API. On paper, the design looked fine. We added retries with exponential backoff. The box was labeled “resilient”.

Then one day, the downstream started responding slowly but did not actually fail. Latency increased, but there were no hard errors. Our service was configured with aggressive timeouts and retries, so every slow request hit the timeout and triggered more retries. The load on the downstream went up, which made it even slower. Within minutes, the entire chain melted down.
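To make that feedback loop concrete, here is a rough back-of-the-envelope sketch. The function names and the numbers are hypothetical, not figures from the incident. It assumes every timed-out attempt is retried and that retries time out at the same rate as first attempts.

```python
def worst_case_calls(max_retries: int, layers: int = 1) -> int:
    """Upper bound on downstream calls caused by one user request.

    Each layer makes 1 original attempt plus up to max_retries retries,
    and every attempt at one layer fans out again at the next layer.
    """
    return (1 + max_retries) ** layers


def downstream_load(incoming_rps: float, timeout_fraction: float, max_retries: int) -> float:
    """Expected calls per second at the downstream when a fraction of attempts time out."""
    # Attempts per request = 1 + p + p^2 + ... + p^max_retries
    attempts = sum(timeout_fraction ** k for k in range(max_retries + 1))
    return incoming_rps * attempts


# Hypothetical numbers: 1000 requests per second, up to 3 retries each.
healthy = downstream_load(1000, timeout_fraction=0.0, max_retries=3)   # 1000.0
degraded = downstream_load(1000, timeout_fraction=0.5, max_retries=3)  # 1875.0
melting = downstream_load(1000, timeout_fraction=1.0, max_retries=3)   # 4000.0
```

The slower the downstream gets, the larger the timeout fraction becomes, and the more load the retries add. That is the loop that melts the chain.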

The architecture diagram was correct at the box level. What was missing were questions like:

  • What happens if this dependency is slow but not down?

  • How many retries are safe for the downstream?

  • At what point should we fail fast and surface a controlled error to the user?

You cannot learn that by solving a system design question in a sixty-minute interview. You learn it when you sit with the system, watch it fail, and then update your mental model and your questions.
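One possible way to encode answers to those three questions is a retry budget combined with an overall deadline. This is a minimal sketch under stated assumptions, not the fix we shipped. `RetryBudget`, `FailFast`, `call_with_budget`, and all default values are hypothetical names and numbers chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryBudget:
    """Allows retries only while they stay a small fraction of first attempts."""

    def __init__(self, ratio: float = 0.1, min_tokens: float = 1.0, max_tokens: float = 10.0) -> None:
        self.ratio = ratio            # 0.1 means roughly 1 retry per 10 requests
        self.max_tokens = max_tokens  # cap so an idle period cannot bank unlimited retries
        self.tokens = min_tokens      # small starting allowance

    def record_request(self) -> None:
        # Every first attempt earns a fraction of a retry token.
        self.tokens = min(self.max_tokens, self.tokens + self.ratio)

    def try_spend(self) -> bool:
        # A retry costs one whole token; refuse when the budget is empty.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class FailFast(Exception):
    """Raised so the caller can show a controlled error instead of waiting."""


def call_with_budget(
    op: Callable[[], T],
    budget: RetryBudget,
    max_attempts: int = 3,
    deadline_s: float = 2.0,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    start = time.monotonic()
    budget.record_request()
    for attempt in range(max_attempts):
        try:
            return op()
        except TimeoutError as exc:
            last_attempt = attempt == max_attempts - 1
            out_of_time = time.monotonic() - start >= deadline_s
            # The budget is only spent when a retry is otherwise allowed.
            if last_attempt or out_of_time or not budget.try_spend():
                raise FailFast("dependency is slow; giving up early") from exc
            # Exponential backoff with full jitter so clients do not retry in lockstep.
            sleep(random.uniform(0, base_delay_s * 2 ** attempt))
    raise FailFast("no attempts were made")
```

The budget is shared across requests, so when the downstream is slow for everyone, most requests fail fast after one attempt instead of piling on more load. The right ratio and deadline depend on what the downstream can actually absorb, which is exactly the question to ask its owners.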

The "Lities" and the Reality of Production

In a whiteboard interview, the boxes always work. In production, the boxes are the easy part. The lines between them are what fail.

When you stay with a system in production, you start to care deeply about the “lities”:

  • Availability

  • Reliability

  • Maintainability

  • Scalability

You realise that a two-terabyte database is not just a box. It is a living system you have to migrate, back up, and monitor. You learn the pain of trying to add a NOT NULL column with a default value to a massive table while it is taking thousands of writes per second, without a maintenance window.

Most of that pain comes from questions you did not ask early enough.
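For the column migration mentioned above, one commonly used approach is to expand, backfill, and only then enforce. The sketch below uses Python's built-in `sqlite3` module purely so that it runs anywhere. The `orders` table, the `currency` column, and the batch size are hypothetical, and locking behaviour differs between database engines and versions.

```python
import sqlite3


def backfill_in_batches(conn: sqlite3.Connection, default: str, batch_size: int = 1000) -> int:
    """Fill NULL values in small transactions so writers are never blocked for long."""
    total = 0
    while True:
        with conn:  # one short transaction per batch, committed on exit
            cur = conn.execute(
                "UPDATE orders SET currency = ? "
                "WHERE id IN (SELECT id FROM orders WHERE currency IS NULL LIMIT ?)",
                (default, batch_size),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
        # In production you would pause here and watch replication lag and lock waits.


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(i,) for i in range(2500)])
conn.commit()

# Step 1 (expand): add the column as nullable, which is a cheap change in many engines.
conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT")

# Step 2 (backfill): populate existing rows in batches while new writes set the column.
updated = backfill_in_batches(conn, default="USD", batch_size=1000)

# Step 3 (enforce): only once no NULLs remain would you add the NOT NULL constraint,
# using whatever mechanism your database engine provides.
remaining = conn.execute("SELECT COUNT(*) FROM orders WHERE currency IS NULL").fetchone()[0]
```

The code is the easy part. The questions are what matter: how long does each batch hold locks, what happens to replicas, and how do you roll back halfway through?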

Availability raises questions like, “What does up really mean for this user?” and “Which failure modes are acceptable in which journey?”

Maintainability forces questions like, “How often will we need to change this schema?” and “Can a new engineer understand and safely modify this in an afternoon?”

Reliability adds, “What are the most likely partial failures, and how will this system degrade in those cases?”

Scalability asks, “Where are our natural limits, what happens when we hit them, and what is the plan before we get there?”

You also encounter what I call the “Observability Paradox”.

Your system is failing. Dashboards are red. But the tool you use for monitoring is also lagging or failing because it cannot handle the volume of error logs and metrics.

You now have to answer, in real time, “Is the system actually down, or is the telemetry lying?”

You cannot learn that panic from a newsletter. You learn it when you are the one who has to make a call while the business is losing money.
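One way to make the "is the telemetry lying?" question easier to answer under pressure is to compare the dashboards against a probe that does not depend on the monitoring pipeline. The sketch below is an illustration only. `Signals`, `Verdict`, `diagnose`, and the sixty-second freshness threshold are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    SYSTEM_DOWN = "system down"
    TELEMETRY_SUSPECT = "telemetry suspect"


@dataclass
class Signals:
    dashboard_red: bool      # what the monitoring stack claims
    telemetry_age_s: float   # seconds since the newest metric sample arrived
    probe_ok: bool           # result of an independent, out-of-band health probe


def diagnose(s: Signals, max_fresh_age_s: float = 60.0) -> Verdict:
    telemetry_fresh = s.telemetry_age_s <= max_fresh_age_s
    if not s.probe_ok:
        # The probe does not rely on the monitoring pipeline, so believe it first.
        return Verdict.SYSTEM_DOWN
    if s.dashboard_red and not telemetry_fresh:
        # The probe passes but the dashboard shows stale data: suspect the telemetry.
        return Verdict.TELEMETRY_SUSPECT
    if s.dashboard_red:
        # Fresh red dashboards with a passing probe may mean partial degradation.
        return Verdict.DEGRADED
    return Verdict.HEALTHY
```

For example, `diagnose(Signals(dashboard_red=True, telemetry_age_s=600, probe_ok=True))` returns `Verdict.TELEMETRY_SUSPECT`. A real probe can also be wrong, so treat this as a prompt for better questions, not as a verdict.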

On diagrams, boxes feel important. In production, the lines are where money is lost. Good architects learn to ask questions about the lines very early, instead of polishing the boxes.

Craft Over Titles

At One2N, our tagline is “Pragmatic Software Engineering, Released to Production”.

We built Prayogshala, our internal learning and R&D lab, to accelerate this kind of growth. It is a place where our engineers can get into the water before they touch a client project.

We do not run “Hello World” tutorials. We assign proofs of concept that reflect the constraints of the portfolios we work with, such as those of PeakXV or Accel.

An engineer might spend weeks on a proof of concept for OpenTelemetry or VictoriaMetrics, not just to see if it works, but to see how it fails. Our Principal SRE, Saurabh Hirani (CHOTU), acts as the difficult customer. He pushes engineers to explain why they chose a specific branching strategy, what tradeoffs they are making on cost, and how their solution affects the bottom line.

This process turns curiosity into muscle memory. It ensures that when our team walks into a room, they are not just drawing boxes. They are bringing a craft that has been tested against the sharp edges of reality.

You do not need a formal lab to apply the same idea.

You can:

  • Run small proofs of concept that deliberately stress the edges of your chosen tools, such as migrations, failure drills, and load tests

  • Assign someone in every design review to play the difficult customer and only ask business and failure questions

  • Keep a log of “questions we wish we had asked earlier” after incidents and read it before starting a new architecture

Over time, this shifts your default from “How do I draw this nicely?” to “What questions am I missing?”

Your influence as an engineer comes from your craft, not your title. Craft includes knowing which questions to ask before you write a single line of code. It includes the humility to realise that the code or design you wrote six months ago should probably be better today.

The best architects I know do not start with Kubernetes clusters and service meshes. They start with uncomfortable questions about what problem is real, what can fail, and what the team and the business are willing to live with.

If you are a curious engineer and want to work with a team that values production experience over whiteboard theater, have a look at our careers page. If you want to grow as an architect, start by changing the first thing you do.

Ask better questions. Draw the boxes later.

"How do I know I am growing in my career?"

An engineer asked me this in a 1:1 recently. I have heard it many times, usually from people who feel they are plateauing at a “Senior” level.

My answer is simple. I use a small test for myself. I look at the design or code I wrote six months ago. If I can see ways to improve it, I know I am growing. If I look at it and cannot find a single thing to make better, I know I have stagnated. I need to learn more.

You can replace “code” with anything: design, architecture, writing, a blog post, or even a business strategy. If you cannot look back and see the flaws in your past work, you have not moved forward. To grow, you have to work on your craft. Craft requires a feedback loop.

Good architecture is no different. If you review an architecture document from six months ago and you cannot see missing questions about failure modes, limits, or tradeoffs, you have probably stopped growing as an architect.

This is why your architecture should start with questions, not boxes.

The Curiosity vs Performance Trap

In my experience, the key differentiator between people I enjoy working with and people I do not is who asks questions out of curiosity.

Engineers and designers who ask questions from a curious lens instead of a performance lens are some of the best people to work with. Curious questions sound like:

  • Help me understand why we chose this approach?

  • What problem were we originally trying to solve here?

  • What happens if we do not do this at all?

Performance questions, on the other hand, sound like:

  • Why did nobody think of this obvious solution?

  • Should we not be doing X instead?

  • Who decided this was a good idea?

You are asking about the same technical thing, but with a completely different intent.

Curious people make teams smarter because they make assumptions visible. They allow the team to see things in a different way and make it safe to say, “I do not know yet.”

Performance driven questions make teams defensive. Even when they are right, they train the team to stay quiet the next time to avoid being put on the spot. They turn technical discussions into a blame game.

If you want to build a culture of “Peace of Mind Engineering”, you have to protect the curious and keep the performance theater out of the room.

You also have to channel that curiosity into the right questions, before anyone starts drawing boxes.

Questions to ask before drawing boxes

Before you open a diagramming tool, ask questions like:

  • What user or business pain are we actually trying to reduce?

  • What happens if we do not build this at all?

  • What is the simplest version that delivers most of the value?

  • What can fail, and how will we know before the customer does?

  • Which constraints are real (SLOs, budgets, team skills) and which are self-imposed?

  • Who will own this in twelve to eighteen months, and what are the painful maintenance tasks?

The quality of your architecture is limited by the quality of questions you ask at this stage. The boxes only come after.

You Cannot Learn to Swim Without Getting into the Water

Many engineers think they can just read system design blogs and newsletters to get better at architecture. They practice system design interviews on whiteboards and assume that is enough experience to scale systems for a large business.

It is not.

You cannot learn how to swim by watching videos. At some point, you have to get into the water.

Blogs and newsletters are useful for high-level awareness. But if you spend years consuming them without owning systems in production, you will stagnate as an architect.

There is no replacement for the experience of building and scaling systems in production. When I interview candidates, many have a decade of experience on paper. Very few have stayed with the same system long enough to see the feedback on their technical decisions.

If you leave a project three months after launch, you never see the sharp edges.

You never see what happens to your elegant microservices when a downstream dependency starts timing out at the 99th percentile. On a whiteboard, you just draw a box with retry logic. In production, that retry logic can trigger a retry storm that takes down your entire infrastructure.

Here is a simple example.

We once had a service that depended on a payments API. On paper, the design looked fine. We added retries with exponential backoff. The box was labeled “resilient”.

Then one day, the downstream started responding slowly but did not actually fail. Latency increased, but there were no hard errors. Our service was configured with aggressive timeouts and retries, so every slow request hit the timeout and was retried. The load on the downstream went up, which made it even slower. Within minutes, the entire chain melted down.
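To see why the load climbs so quickly, a rough back-of-the-envelope model helps. The sketch below is an illustration, not the real numbers from this incident: `timeout_probability`, `max_retries`, and the 1,000 requests per second figure are all assumptions, and it treats every attempt as timing out independently.

```python
def expected_attempts(timeout_probability: float, max_retries: int) -> float:
    """Average attempts per request when each attempt may time out and be retried.

    A request always makes one attempt. Retry k only happens if the
    previous k attempts all timed out, which has probability p ** k.
    """
    return sum(timeout_probability ** k for k in range(max_retries + 1))


def downstream_rps(incoming_rps: float, timeout_probability: float, max_retries: int) -> float:
    """Attempts per second that reach the downstream (hypothetical model)."""
    return incoming_rps * expected_attempts(timeout_probability, max_retries)


# Assumed traffic: 1,000 requests per second, up to 3 retries per request.
healthy = downstream_rps(1000, timeout_probability=0.0, max_retries=3)
half_slow = downstream_rps(1000, timeout_probability=0.5, max_retries=3)
all_slow = downstream_rps(1000, timeout_probability=1.0, max_retries=3)
```

Under these assumptions, a healthy downstream sees 1,000 attempts per second, but once every attempt times out it sees 4,000. In a real incident the feedback is worse than this model suggests, because the extra load itself pushes the timeout probability up.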

The architecture diagram was correct at the box level. What was missing were questions like:

  • What happens if this dependency is slow but not down?

  • How many retries are safe for the downstream?

  • At what point should we fail fast and surface a controlled error to the user?

You cannot learn that by solving a system design question in a sixty-minute interview. You learn it when you sit with the system, watch it fail, and then update your mental model and your questions.
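One way to turn the three questions above into code is to bound retries in three ways: a cap on attempts, an overall deadline, and a shared retry budget. This is a minimal sketch under assumptions, not the fix we shipped: `RetryBudget`, `FailFast`, `call_with_budget`, and every default value are hypothetical names and numbers chosen for illustration.

```python
import random
import time


class FailFast(Exception):
    """Raised when we stop retrying and surface a controlled error."""


class RetryBudget:
    """Lets retries add only a bounded percentage of extra load.

    Simplification: counters never reset. A production version would
    track a sliding time window instead.
    """

    def __init__(self, retry_percent=10, min_retries=1):
        self.retry_percent = retry_percent  # assumed: retries add at most 10% load
        self.min_retries = min_retries      # lets a low-traffic caller retry at all
        self.calls = 0
        self.retries = 0

    def record_call(self):
        self.calls += 1

    def try_spend(self):
        allowed = max(self.min_retries, self.calls * self.retry_percent // 100)
        if self.retries < allowed:
            self.retries += 1
            return True
        return False


def call_with_budget(op, budget, max_attempts=3, deadline_s=2.0,
                     base_delay_s=0.1, sleep=time.sleep, clock=time.monotonic):
    """Call op(), retrying on TimeoutError only while all three limits allow it."""
    budget.record_call()
    start = clock()
    for attempt in range(max_attempts):
        try:
            return op()
        except TimeoutError:
            last_attempt = attempt == max_attempts - 1
            # Exponential backoff with full jitter.
            delay = random.uniform(0, base_delay_s * (2 ** attempt))
            out_of_time = (clock() - start) + delay >= deadline_s
            # The budget is only spent if a retry would actually happen.
            if last_attempt or out_of_time or not budget.try_spend():
                raise FailFast("dependency is slow; giving up early") from None
            sleep(delay)
```

The design choice is that the budget is shared across requests. A per-request retry cap alone still lets every request retry at the same moment, which is what produces the storm. With a shared budget, a slow downstream sees only a small, bounded increase in traffic, and most callers get a fast, controlled error instead.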

The "Lities" and the Reality of Production

In a whiteboard interview, the boxes always work. In production, the boxes are the easy part. The lines between them are what fail.

When you stay with a system in production, you start to care deeply about the “lities”:

  • Availability

  • Reliability

  • Maintainability

  • Scalability

You realise that a two-terabyte database is not just a box. It is a living system you have to migrate, back up, and monitor. You learn the pain of trying to add a non-null column with a default value to a massive table while it takes thousands of writes per second, without a maintenance window.

Most of that pain comes from questions you did not ask early enough.
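One common way to reduce that pain is to split the change into small steps: add the column as nullable, backfill existing rows in small batches, and only then enforce the constraint. The sketch below demonstrates the batching step against an in-memory SQLite table so that it can run anywhere. The `orders` table, the `currency` column, the `"INR"` value, and the batch size are all hypothetical, and locking behavior on a production database will differ from SQLite, so treat this as an outline of the idea rather than a migration script.

```python
import sqlite3


def backfill_in_batches(conn, table, column, value, batch_size=1000):
    """Fill NULLs in `column` one id range at a time, committing per batch.

    `table` and `column` are interpolated into SQL, so they must be
    trusted identifiers, never user input.
    """
    (max_id,) = conn.execute(f"SELECT COALESCE(MAX(id), 0) FROM {table}").fetchone()
    batches = 0
    for low in range(0, max_id + 1, batch_size):
        conn.execute(
            f"UPDATE {table} SET {column} = ? "
            f"WHERE id > ? AND id <= ? AND {column} IS NULL",
            (value, low, low + batch_size),
        )
        conn.commit()  # short transactions keep each batch's lock time small
        batches += 1
    return batches


# Demo data: a small stand-in for the "massive table".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(i,) for i in range(2500)])
conn.commit()

# Step 1: add the column as nullable so existing rows are not rewritten.
conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT")

# Step 2: backfill old rows in small batches while writes continue.
batches = backfill_in_batches(conn, "orders", "currency", "INR", batch_size=1000)

# Step 3: confirm nothing is left before enforcing NOT NULL (database specific).
remaining = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE currency IS NULL"
).fetchone()[0]
```

Each step raises its own early questions: how long one batch holds locks, what new writes put in the column while the backfill runs, and how you roll back if the backfill has to stop halfway.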

Availability raises questions like, “What does up really mean for this user?” and “Which failure modes are acceptable in which journey?”

Maintainability forces questions like, “How often will we need to change this schema?” and “Can a new engineer understand and safely modify this in an afternoon?”

Reliability adds, “What are the most likely partial failures, and how will this system degrade in those cases?”

Scalability asks, “Where are our natural limits, what happens when we hit them, and what is the plan before we get there?”

You also encounter what I call the “Observability Paradox”.

Your system is failing. Dashboards are red. But the tool you use for monitoring is also lagging or failing because it cannot handle the volume of error logs and metrics.

You now have to answer, in real time, “Is the system actually down, or is the telemetry lying?”

You cannot learn that panic from a newsletter. You learn it when you are the one who has to make a call while the business is losing money.
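One question worth asking early is whether you have a health signal that does not depend on the monitoring pipeline. The sketch below is an illustration built on assumptions: `Signals`, `assess`, and the 60 second staleness threshold are hypothetical, and the probe stands in for any independent check, such as a synthetic request sent from outside the system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signals:
    dashboards_red: bool       # what the monitoring tool currently shows
    telemetry_age_s: float     # seconds since the last datapoint arrived
    probe_ok: Optional[bool]   # independent probe result, None if it could not run


def assess(signals: Signals, max_telemetry_age_s: float = 60.0) -> str:
    """Combine telemetry with an out-of-band probe to decide what to trust."""
    stale = signals.telemetry_age_s > max_telemetry_age_s

    # The independent probe is trusted over dashboards when it is available.
    if signals.probe_ok is False:
        return "down: independent probe failed"
    if signals.probe_ok is True:
        if stale:
            return "up: probe passes, telemetry is stale"
        if signals.dashboards_red:
            return "degraded: probe passes, fresh telemetry shows errors"
        return "healthy"

    # No probe result: telemetry is all we have, so say how much to trust it.
    if stale:
        return "unknown: no probe and telemetry is stale"
    if signals.dashboards_red:
        return "suspect down: fresh telemetry shows errors, no probe"
    return "healthy per telemetry only"
```

For example, `assess(Signals(dashboards_red=True, telemetry_age_s=300, probe_ok=True))` reports that the system is up and the telemetry is stale, which is exactly the case where red dashboards would otherwise push you toward the wrong call.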

On diagrams, boxes feel important. In production, the lines are where money is lost. Good architects learn to ask questions about the lines very early, instead of polishing the boxes.

Craft Over Titles

At One2N, our tagline is “Pragmatic Software Engineering, Released to Production”.

We built Prayogshala, our internal learning and R&D lab, to accelerate this kind of growth. It is a place where our engineers can get into the water before they touch a client project.

We do not run “Hello World” tutorials. We assign proofs of concept that reflect the constraints of the portfolios we work with, such as those of PeakXV or Accel.

An engineer might spend weeks on a proof of concept for OpenTelemetry or VictoriaMetrics, not just to see if it works, but to see how it fails. Our Principal SRE, Saurabh Hirani (CHOTU), acts as the difficult customer. He pushes engineers to explain why they chose a specific branching strategy, what tradeoffs they are making on cost, and how their solution affects the bottom line.

This process turns curiosity into muscle memory. It ensures that when our team walks into a room, they are not just drawing boxes. They are bringing a craft that has been tested against the sharp edges of reality.

You do not need a formal lab to apply the same idea.

You can:

  • Run small proofs of concept that deliberately stress the edges of your chosen tools, such as migrations, failure drills, and load tests

  • Assign someone in every design review to play the difficult customer and only ask business and failure questions

  • Keep a log of “questions we wish we had asked earlier” after incidents and read it before starting a new architecture

Over time, this shifts your default from “How do I draw this nicely?” to “What questions am I missing?”

Your influence as an engineer comes from your craft, not your title. Craft includes knowing which questions to ask before you write a single line of code. It includes the humility to realise that the code or design you wrote six months ago should probably be better today.

The best architects I know do not start with Kubernetes clusters and service meshes. They start with uncomfortable questions about what problem is real, what can fail, and what the team and the business are willing to live with.

If you are a curious engineer and want to work with a team that values production experience over whiteboard theater, have a look at our careers page. If you want to grow as an architect, start by changing the first thing you do.

Ask better questions. Draw the boxes later.

Keywords

Software Architecture, System Design, Site Reliability Engineering (SRE), Technical Leadership, Engineering Career Growth, Production Readiness
