From Principle to Form – How to Use LLMs Correctly

Arbitrary, unstructured use of large language models (LLMs) often leads to topic divergence, hallucination, and a phenomenon one might call AI psychosis. To counteract this, I present a systematic workflow that transforms a raw idea into a rigorous, actionable plan while maintaining responsible LLM usage.

1. The Core Workflow

The process is built around three fundamental stages, connected as shown below.

$$ \begin{array}{c} \boxed{E_1\;(\text{INFO DUMP})} \;\overset{\text{BRAINSTORM}}{\longleftrightarrow}\; \boxed{E_2\;(\text{DRAFT})} \;\underset{\text{ASK FOR L1}}{\overset{\text{REFINE ARGUMENT}}{\longleftrightarrow}}\; \boxed{ \begin{array}{c} \text{LLM INFRASTRUCTURE} \\ (E_3) \\[4pt] \begin{array}{ccc} & \bullet & \\ \diagup & \vert & \diagdown \\ \bullet & - & \bullet \\ \diagdown & \vert & \diagup \\ & \bullet & \end{array} \end{array} } \end{array} $$

Informal description of the stages:
\(E_1\) (Info Dump): an initial, unstructured seed of ideas, observations, and desired outcomes.
\(E_2\) (Draft): a first structured version, an early attempt to formalize the core argument or plan.
\(E_3\) (LLM Infrastructure): a controlled, multi‑agent LLM system equipped with search capabilities and a predefined family of critical questions, denoted L1.

The L1 family of questions includes:

  • Holes / unconsidered factors
  • Lack of justification for a certain point
  • Relevant ideas and literature
  • Linguistic preciseness check
  • Novel ideas that emerge / domain expansion
  • Feasible mathematization of certain concepts
  • Feasibility and novelty check against previous relevant ideas and literature
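
Keeping the L1 family as plain data makes it easy to ask the same questions in every refinement round. The following is a minimal Python sketch of that idea; the names `L1_QUESTIONS` and `build_l1_prompt` and the wording of the prompt template are assumptions made for illustration, not part of the workflow as specified.

```python
# Hypothetical sketch: the L1 family as data plus a simple prompt builder.
# The question texts mirror the list above; the template wording is assumed.

L1_QUESTIONS: tuple[str, ...] = (
    "Holes / unconsidered factors",
    "Lack of justification for a certain point",
    "Relevant ideas and literature",
    "Linguistic preciseness check",
    "Novel ideas that emerge / domain expansion",
    "Feasible mathematization of certain concepts",
    "Feasibility and novelty check against previous relevant ideas and literature",
)


def build_l1_prompt(draft: str, questions: tuple[str, ...] = L1_QUESTIONS) -> str:
    """Return one prompt that asks every L1 question about the draft E_2."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return (
        "Review the draft below against each numbered item. "
        "Answer every item separately and state your reasoning.\n\n"
        f"Items:\n{numbered}\n\nDraft:\n{draft}"
    )


print(build_l1_prompt("Example draft text."))
```

Passing a shorter tuple as `questions` restricts a round to a subset of L1, which may be useful when only some questions still yield improvements.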

2. Formalising the Iterative Dynamics

Definition (Brainstorming). The process \(E_1 \rightleftarrows E_2\) is called brainstorming. Its purpose is to find edge cases, poke holes in the draft, stress‑test the systems the draft takes for granted by assuming they will fail, and formalise the resulting insights into a refined structure.

Definition (Refinement). The process \(E_2 \rightleftarrows E_3\) is called refinement. Here the draft is interrogated with L1‑questions, the LLM infrastructure provides answers and critiques, and the draft is updated accordingly.

Let us associate a cost and a benefit with the \(i\)-th iteration of each process. Denote \[ B_{\text{B}}(i),\ C_{\text{B}}(i) \quad \text{(brainstorming)},\qquad B_{\text{R}}(i),\ C_{\text{R}}(i) \quad \text{(refinement)}, \] where \(B\) and \(C\) stand for marginal benefit and marginal cost of that single iteration.

Stopping rule. The process should continue only as long as the marginal benefit exceeds the marginal cost. Formally, let \[ n_{\text{B}} = \min\{\,i \mid C_{\text{B}}(i) \ge B_{\text{B}}(i)\,\},\qquad n_{\text{R}} = \min\{\,i \mid C_{\text{R}}(i) \ge B_{\text{R}}(i)\,\} \] be the first iterations at which the marginal cost catches up with the marginal benefit. The economically sensible number of iterations is \(n_{\text{B}}-1\) for brainstorming and \(n_{\text{R}}-1\) for refinement. Because the stopping condition can only be observed once the iteration that triggers it has been carried out, the total number of iterations performed is \[ N = n_{\text{B}} + n_{\text{R}} . \]
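
The stopping rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: iterations are indexed from 1, the marginal benefit and cost are supplied as functions (in practice they would have to be estimated after each round), and `first_stop_index`, `max_iter`, and the toy curves are hypothetical names and values chosen for the example.

```python
from typing import Callable, Optional


def first_stop_index(
    benefit: Callable[[int], float],
    cost: Callable[[int], float],
    max_iter: int = 1000,
) -> Optional[int]:
    """Return n = min{ i >= 1 : cost(i) >= benefit(i) }.

    Returns None if the condition is never met up to max_iter
    (max_iter is an assumed safety cap, not part of the rule).
    """
    for i in range(1, max_iter + 1):
        if cost(i) >= benefit(i):
            return i
    return None


# Toy marginal curves, invented purely for illustration.
n_b = first_stop_index(benefit=lambda i: 10 / i, cost=lambda i: 2.0)
n_r = first_stop_index(benefit=lambda i: 8 / i, cost=lambda i: 1.0 + 0.5 * i)

print(n_b, n_r, n_b + n_r)  # prints: 5 4 9
```

With these toy curves the worthwhile iterations are \(n_{\text{B}}-1 = 4\) and \(n_{\text{R}}-1 = 3\), while \(N = n_{\text{B}} + n_{\text{R}} = 9\).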

3. Diminishing Returns and the Role of Initial Quality

Note: The decay models and associated assumptions presented in this section are simplified approximations and should not be interpreted as exact mathematical models of the process. They serve only to illustrate the concept of diminishing returns in a general, non‑prescriptive way.

Empirical evidence and theoretical considerations suggest that marginal benefits decay. Two natural models for a process \(X \in \{\text{B}, \text{R}\}\) are \[ B_X(i) = \beta_0\,\exp(-\lambda\cdot i),\qquad B_X(i) = \frac{\beta_0}{i^{\alpha}}, \] while the marginal cost often grows linearly: \[ C_X(i) = c_0 + \varepsilon\cdot i . \] Here \(\lambda, \alpha > 0\) are decay rates determined by the LLM setup (system prompts, model choice, etc.), and \(\beta_0\) sets the scale of the per‑iteration benefit, bounding it from above for \(i \ge 1\).

The simple models above are only rough approximations. In particular, we reject the assumption that the marginal cost of refinement is constant; marginal costs vary with iteration, context, and the specific L1 questions asked. Therefore we do not derive closed‑form expressions based on constant costs, and the stopping condition is applied iteratively without assuming constancy.
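
To make the iterative application of the stopping condition concrete, the sketch below evaluates both decay models against the linear cost model round by round, without any closed form. All parameter values are arbitrary placeholders chosen for illustration, and the function names are hypothetical; with measured per‑round costs, `linear_cost` would simply be replaced by a lookup of those measurements.

```python
import math


def exp_benefit(i: int, beta0: float, lam: float) -> float:
    """Exponential decay model: B(i) = beta0 * exp(-lam * i)."""
    return beta0 * math.exp(-lam * i)


def power_benefit(i: int, beta0: float, alpha: float) -> float:
    """Power-law decay model: B(i) = beta0 / i**alpha."""
    return beta0 / i ** alpha


def linear_cost(i: int, c0: float, eps: float) -> float:
    """Linear cost model: C(i) = c0 + eps * i."""
    return c0 + eps * i


def stop_index(benefit, cost, max_iter: int = 10_000):
    """First i >= 1 with cost(i) >= benefit(i), or None (max_iter is a safety cap)."""
    for i in range(1, max_iter + 1):
        if cost(i) >= benefit(i):
            return i
    return None


# Placeholder parameters, not fitted to any data.
beta0, lam, alpha, c0, eps = 10.0, 0.5, 1.0, 1.0, 0.1


def cost(i: int) -> float:
    return linear_cost(i, c0, eps)


n_exp = stop_index(lambda i: exp_benefit(i, beta0, lam), cost)
n_pow = stop_index(lambda i: power_benefit(i, beta0, alpha), cost)

# Tabulate the curves so the crossing points are visible.
for i in range(1, 8):
    print(i, round(exp_benefit(i, beta0, lam), 3),
          round(power_benefit(i, beta0, alpha), 3), round(cost(i), 3))
print("stop (exponential):", n_exp)  # 4 with these placeholders
print("stop (power law):", n_pow)    # 7 with these placeholders
```

Under these placeholder values the heavier tail of the power law keeps iterations worthwhile for longer than the exponential model does.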

A crucial parameter is the quality of the initial seed \(E_1\). Define \(k \in [0,1]\) as the fraction of all foreseeable edge cases, failure modes, and systemic structures that are already captured in \(E_1\) through exhaustive first‑principles decomposition. A higher \(k\) reduces the remaining benefit obtainable from brainstorming. However, precise formulas linking \(k\) to the optimal number of iterations depend on unknown cost and benefit shapes; we avoid assuming constant marginal costs. In general, better initial quality leads to fewer required iterations, but the exact relationship is context‑dependent.

Multilingual token efficiency (Sapir–Whorf compression). When agents communicate internally in languages such as Chinese, German, or Russian, conceptual compression may lower the total number of iterations \(N\) and improve token and energy efficiency. Where this holds, it enhances the overall viability of the workflow by reducing computational overhead; the effect depends on the tokenizer and on the model's proficiency in the chosen language, so it should be measured rather than assumed.
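
One way to check the compression idea for a given setup is to count the cost of the same content in several languages. The sketch below is a hedged illustration: `compare_costs` and `utf8_byte_count` are hypothetical helpers, and the UTF‑8 byte count is only a crude stand‑in for a real tokenizer. For an actual measurement, the token‑counting function of the tokenizer in use should be passed as `count`.

```python
from typing import Callable


def utf8_byte_count(text: str) -> int:
    """Crude proxy for token cost: the number of UTF-8 bytes in the text."""
    return len(text.encode("utf-8"))


def compare_costs(
    variants: dict[str, str],
    count: Callable[[str], int] = utf8_byte_count,
) -> dict[str, int]:
    """Apply the counting function to each language variant of the same content."""
    return {lang: count(text) for lang, text in variants.items()}


# The same single concept in three languages (illustrative only).
variants = {"en": "knowledge", "de": "Wissen", "zh": "知识"}
costs = compare_costs(variants)
print(costs)  # {'en': 9, 'de': 6, 'zh': 6}
```

A single word says little about whole agent messages, so a realistic comparison would use representative internal messages rather than isolated terms.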

4. The Multi‑Agent LLM Infrastructure

The LLM Infrastructure \(E_3\) must have search capability. I typically employ a four‑agent system with the following roles:

(i) Main Processor. Receives the human input and the expected output, aims to address the prompt fully without hallucination, and provides inventive examples beyond those supplied. Its internal reasoning draws on techniques such as abstraction laddering, Ishikawa diagrams, impact‑effort matrix evaluation, first‑principles thinking, and \(n\)-th order thinking.

(ii) Example Feeder. Supplies many examples, sub‑domains, edge cases, variations, and categories of the user‑defined domain. For instance, if the user explores an idea in abstract algebra, the Example Feeder would list homomorphisms, ring ideals, etc.

(iii) Data Explorer. An agent with internet access that searches and understands human knowledge in forums, social media, white papers, pre‑prints, and other sources.

(iv) Creative Thinker. Generates innovative concepts, strategies, narratives, and designs by fusing existing knowledge with reasonable extrapolation and exploration of abstract concepts and interdisciplinary applications. It reasons in an explicit logical chain to reduce the risk of hallucination.
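
The four roles can be represented as small, uniform objects that differ only in their system prompt. The Python sketch below is a minimal illustration under several assumptions: the prompt wording is invented, the model backend is abstracted as a function `(system_prompt, message) -> reply`, and the simple fan‑out in `consult_all` is just one possible way to orchestrate the agents, since the text does not fix an orchestration scheme. `echo_llm` is a stand‑in so the example runs without any model API.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed abstraction: any backend is a function (system_prompt, message) -> reply.
LLMCall = Callable[[str, str], str]


@dataclass(frozen=True)
class Agent:
    name: str
    system_prompt: str  # illustrative wording, not the prompts actually used

    def run(self, llm: LLMCall, message: str) -> str:
        """Send one message to the backend under this agent's system prompt."""
        return llm(self.system_prompt, message)


AGENTS = [
    Agent("Main Processor", "Address the request fully; reason from first principles."),
    Agent("Example Feeder", "List examples, sub-domains, edge cases and variations."),
    Agent("Data Explorer", "Search external sources and summarise what you find."),
    Agent("Creative Thinker", "Propose new ideas by step-by-step extrapolation."),
]


def consult_all(llm: LLMCall, draft: str) -> dict[str, str]:
    """Send the same draft to every agent and collect the replies by agent name."""
    return {agent.name: agent.run(llm, draft) for agent in AGENTS}


def echo_llm(system_prompt: str, message: str) -> str:
    """Stand-in backend for testing; a real system would call a model here."""
    return f"[{system_prompt[:20]}...] {message}"


replies = consult_all(echo_llm, "Draft E_2")
for name, reply in replies.items():
    print(f"{name}: {reply}")
```

Keeping the backend behind a plain function makes it possible to give only the Data Explorer a search‑enabled backend while the other agents use an offline one.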

5. Prompt Engineering and Workflow Efficiency

The decay parameters \(\lambda\) (exponential) and \(\alpha\) (power‑law) are not fixed properties of the LLM alone; they are strongly influenced by the system prompts assigned to each agent. Well‑engineered prompts that enforce step‑by‑step reasoning, constrain hallucinations, and focus output on the L1 questions increase the effective decay rate: most of the attainable improvement is captured in the first few rounds, so the marginal benefit of each later round drops faster. Conversely, vague or under‑specified prompts produce shallow improvements spread over many rounds, corresponding to a smaller \(\lambda\) or \(\alpha\).

Concretely, for brainstorming, tighter prompts that demand first‑principles decomposition and edge‑case enumeration from the start lead to a higher effective \(\lambda\), so the stopping condition is met sooner and \(n_{\text{B}}\) is smaller. For refinement, prompts that force the agents to answer each L1 question directly, with citations or explicit reasoning, increase \(\alpha\) in the power‑law model and reduce the need for repeated iterations. Thus, careful prompt engineering directly improves workflow efficiency by shrinking \(n_{\text{B}}\) and \(n_{\text{R}}\).

The initial quality parameter \(k\) (the fraction of edge cases, failure modes, and structures already present in \(E_1\)) exhibits a loose inverse proportional relationship with the total number of iterations \(N\): \[ N \;\propto\; \frac{1}{k} \qquad \text{(approximately)}. \] That is, doubling the thoroughness of the initial info dump roughly halves the number of iterations required. This relationship is not exact, since it depends on the decay shapes and the cost structure, but it serves as a useful rule of thumb: invest upfront in a rich, well‑structured \(E_1\) to minimise later iterative overhead.
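
As a purely illustrative calculation (the numbers are invented for the example, not measured): if a seed with \(k = 0.4\) needed about \(N = 10\) iterations, the rule of thumb would suggest, for a seed with \(k' = 0.8\), \[ N' \;\approx\; N\,\frac{k}{k'} \;=\; 10 \cdot \frac{0.4}{0.8} \;=\; 5 . \] Any real workflow would need its own measured baseline to calibrate such an estimate.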

Overall, the total number of iterations \(N\) tends to fall as the initial quality \(k\) and the decay rates \(\lambda\) (exponential) and \(\alpha\) (power‑law) rise, and tends to grow with the initial marginal benefit \(\beta_0\) and with the overhead introduced by poor prompt engineering (i.e., lower effective \(\lambda\) or \(\alpha\)). Since the two decay models are alternatives, only the decay rate of the model in use is relevant in a given setting. Multilingual Sapir–Whorf compression, where it applies, reduces \(N\) by increasing conceptual density per token, thus acting as an inverse factor. Taken together, \(N \propto \frac{\beta_0}{\,k \cdot \lambda \cdot \alpha \cdot \text{(compression)}\;}\) in a loose, approximate sense, with all relationships being non‑exact and context‑dependent.

Hyper‑structured initialization, early cross‑domain injection, explicit mathematization, and a disciplined multi‑agent architecture are together meant to make the journey from a raw principle to a rigorous, actionable form as efficient as is practical. The cost‑benefit formalism presented here gives a shared vocabulary for deciding when to stop, and Sapir–Whorf‑style multilingual compression may further improve the economic viability of the whole pipeline.