Introduction
Writing prompts for generative image and video models is a new skill that builds on communication abilities you already have. Just like giving creative direction to a colleague, prompting requires you to articulate your vision clearly.
The key difference is that generative models interpret your words more literally and lack the shared context that a colleague might have. If you prompt "a beautiful landscape," the model doesn't know whether you envision mountains at sunset or a tropical beach at noon:
| Prompt: A beautiful landscape | |
Both interpretations are technically correct, but may not match your vision.
This guide explains how to write effective prompts by starting simple and adding detail strategically, using positive language, and embracing iteration as part of the creative process.
Iteration is part of the process
Before diving into techniques, understand that the perfect result may not arrive on the first try. Creative work (whether you're writing, designing, or filming) involves drafting, collaborating, and refining. The same cycle is normal and expected when working with generative media.
Think of prompting as a conversation with the model: You make a request, review the response, then clarify or expand your request based on what you see. Each generation teaches you something new about how the model interprets your words.
| Iteration 1 | Iteration 2 | Iteration 3 |
|---|---|---|
| A serene pond with koi fish | High angle looking down at a serene pond with koi fish | High angle looking down at a serene pond. A koi fish emerges and breaches the surface, sending gentle ripples through the surrounding lily pads. |
Seasoned creators often build this iterative approach into their workflows on purpose, because it lets you refine your vision while exploring possibilities you hadn't imagined.
Prompting strategies
There are two main approaches when writing an initial prompt. Each has distinct advantages depending on your workflow:
- Starting simple lets you add one element at a time and see what each change does.
- Starting detailed can reduce the total steps in your iteration process, getting you closer to a desired result faster.
True or False? Detailed prompts are always better than simple prompts.
False. Simple and detailed prompts are both valid and effective approaches to prompting.
Seasoned creators switch freely between the two strategies as needed: simple prompts for exploration, detailed prompts for refining results.
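| Strategy | Pros | Cons |
|---|---|---|
| Simple | Fast and flexible; gives the model creative freedom; lets you see what each added element changes | May take more iterations to reach a specific result |
| Detailed | Can reduce the number of iterations needed to match specific requirements | Harder to troubleshoot; over-specification can lead to unexpected or unnatural results |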
Simple
You don't always need elaborate prompts. Simple instructions like "a cat sitting on a windowsill" or "a person walking through a city street" can produce excellent results. These prompts give the model creative freedom to interpret details like lighting, camera movement, composition, and style.
Think of simple prompts as sketches. They're fast, flexible, and perfect for early creative exploration. When a basic prompt doesn't quite work, you can always add more details to refine the result.
When starting simple, you're building on a foundation rather than troubleshooting a paragraph of details.
Detailed
Detailed prompts help when you need results that match particular requirements, like a specific mood, setting, or visual style. Starting with more detail can reduce the total number of iterations needed to reach your final result.
When adding detail to a prompt, use these fundamental questions to reduce ambiguity:
| Question | What to specify |
|---|---|
| Who or what is the subject? | Be specific about the main focus. "A person" could be anyone, but "a serious woman in her 30s wearing business attire" creates a clearer picture. |
| What is happening? | Describe the action or state. "Standing" and "running" create very different results, even with the same subject. |
| When does this take place? | Time of day affects lighting and mood. "Dawn" looks different from "midnight" or "midday." |
| Where is this happening? | Setting provides context. "In a forest" differs from "in a modern office" or "on a busy street." |
| How should it look? | This covers style, mood, and more technical aspects like camera motion. Consider the atmosphere (peaceful, energetic, mysterious) and visual approach (photographic, illustrated, cinematic). |
The goal of detail is to remove ambiguity, not to control every pixel.
Extremely complex, multi-paragraph prompts leave the model little room for creative freedom and force it to operate within tightly defined parameters. Paradoxically, this over-specification can lead to unexpected or unnatural results as the model struggles to honor every detail at once.
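As a rough illustration, the five questions can be treated as optional slots that are filled in only as far as the result requires. The sketch below is a minimal example under that assumption; `PromptDetails` and `build_prompt` are hypothetical names made up for this guide, not part of any product API.

```python
from dataclasses import dataclass


@dataclass
class PromptDetails:
    """Answers to the five questions. All field names are illustrative."""
    who: str          # subject
    what: str = ""    # action or state
    when: str = ""    # time of day
    where: str = ""   # setting
    how: str = ""     # style, mood, camera


def build_prompt(details: PromptDetails) -> str:
    """Join whichever questions were answered into one natural-language prompt."""
    parts = [details.who, details.what, details.where, details.when]
    sentence = " ".join(p.strip() for p in parts if p.strip())
    if details.how.strip():
        sentence += ". " + details.how.strip()
    return sentence + "."


prompt = build_prompt(PromptDetails(
    who="A serious woman in her 30s wearing business attire",
    what="walking",
    where="on a busy street",
    when="at dawn",
    how="Cinematic, peaceful atmosphere",
))
print(prompt)
# A serious woman in her 30s wearing business attire walking on a busy street at dawn. Cinematic, peaceful atmosphere.
```

Leaving a field empty simply hands that decision back to the model, which keeps the prompt focused on removing ambiguity rather than specifying everything.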
Best practices
Following these best practices will improve your results:
Use positive phrasing
Models respond better to descriptions of what you want to happen rather than what you don't want to happen.
- ❌ not blurry
- ✅ sharp focus, high detail
Positive phrasing works because models are trained to create what you describe. When you say what not to include, the model still has to interpret that concept and may include it anyway.
Think of it like giving directions: "turn left at the library" is clearer than "don't turn right at the library."
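If you write many prompts, a quick check for negative wording can help you spot phrases worth rewriting. This is only a sketch: the list of negation cues is a hypothetical starting point, and the function flags words without judging whether a rewrite is actually needed.

```python
import re

# Hypothetical list of negation cues; extend it for your own prompts.
NEGATION_PATTERN = re.compile(
    r"\b(?:not|no|without|never|don't|doesn't|isn't)\b",
    re.IGNORECASE,
)


def find_negations(prompt: str) -> list[str]:
    """Return the negation cues found in a prompt, lowercased, in order."""
    return [m.group(0).lower() for m in NEGATION_PATTERN.finditer(prompt)]


print(find_negations("not blurry"))                # ['not']
print(find_negations("sharp focus, high detail"))  # []
```

When the check returns a match, describe the desired outcome instead, as in the "sharp focus, high detail" example above.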
Avoid ambiguous or conceptual language
Models interpret words literally and may not understand subjective or abstract terms the same way you do. Words like "beautiful," "professional," or "interesting" mean different things to different people.
- ❌ A beautiful sunset
- ✅ a sunset with vibrant orange and pink clouds over the ocean
Concrete descriptions work better because they give the model specific visual elements to create. Instead of saying something should look "modern," describe what modern means to you: clean lines, minimal decoration, neutral colors, or large windows.
Avoid conflicting instructions
When your prompt contains contradictory details, the model will try to honor all of them, often resulting in unclear or unexpected outcomes.
- ❌ dramatic shadows with soft, even lighting
- ✅ dramatic shadows with strong directional light
Review your prompt for elements that work against each other. Requesting both "peaceful, calm atmosphere" and "energetic, dynamic action" sends mixed signals. Asking for "vintage 1920s style" alongside "modern minimalist aesthetic" creates confusion about the overall look.
Troubleshooting results
When you're not getting the results you want after a few tries, step back and use these strategies to diagnose the issue and iterate on your results.
Use Chat Mode to refine or rephrase your prompt
Use Chat Mode as a collaborative partner. Explain what you're trying to create and what's not working in the results. For example:
I'm trying to create a cozy coffee shop interior, but the images keep coming out too dark and empty. Can you help me rephrase my prompt?
Chat Mode can suggest alternative phrasing, point out ambiguous language, or ask clarifying questions that help you articulate your vision more precisely. This is especially helpful when you're struggling to describe what you want in words.
Simplify complex prompts
If you started with a detailed prompt that isn't working, try stripping it down to the core concept. Remove all the style descriptors, lighting details, and compositional notes. Start with just the subject or most important elements of the generation.
Once you have a basic version that's closer to what you want, add back one detail at a time:
- Does adding "soft lighting" improve the result or make it worse?
- What about "wide angle shot"?
This process helps you identify which details are helping and which might be causing problems or conflicting with each other.
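The add-back process can be sketched as a list of prompts, each extending the previous one by a single detail. The helper below is a hypothetical illustration; it only builds the prompt text, and you would generate and review each version yourself.

```python
def incremental_prompts(core: str, details: list[str]) -> list[str]:
    """Return the core prompt, then versions that each add one more detail."""
    prompts = [core]
    for i in range(1, len(details) + 1):
        prompts.append(", ".join([core] + details[:i]))
    return prompts


for p in incremental_prompts(
    "a cozy coffee shop interior",
    ["soft lighting", "wide angle shot"],
):
    print(p)
# a cozy coffee shop interior
# a cozy coffee shop interior, soft lighting
# a cozy coffee shop interior, soft lighting, wide angle shot
```

If a version looks worse than the one before it, drop the detail that was just added and rebuild the list.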
Reinforce missed elements through natural language
If a component is missing or wasn't carried out in an initial generation, try iterating on your prompt to reinforce it through natural language. Iteration is a normal part of working with generative media, much like the drafting phases of other creative processes.
For example, if the first generation didn't produce a high angle from the prompt "High angle of a koi fish pond," we would reinforce the angle by iterating with "High angle looking down at a koi fish pond."
Know when and how to move on
Not every vision is achievable in a single generation, and that's okay. If you've tried multiple approaches and you're still not getting what you need, consider whether:
- The concept might work better with a different approach or model
- You can make final adjustments with Aleph or local editing tools
FAQ
Do JSON prompts provide more accurate results than natural language prompts?
JSON prompts create a placebo effect of greater accuracy, because JSON formatting forces creators to break down their visual concepts when they might not otherwise do so. Ultimately, generative models ignore the JSON formatting itself; what matters is the detail provided within the prompt.
You can get the same benefit more simply by asking the who, what, when, where, and how questions, which is what we recommend.
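To see that the detail matters more than the formatting, the sketch below takes a JSON-style prompt and rewrites it as a plain sentence with the same information. The key names and the `to_natural_language` helper are assumptions made for this example, not a required schema.

```python
import json

# A hypothetical JSON-style prompt; the keys mirror the five questions.
json_prompt = """
{
  "who": "a koi fish",
  "what": "breaches the surface of a serene pond",
  "where": "among lily pads",
  "when": "at midday",
  "how": "high angle looking down, photographic"
}
"""


def to_natural_language(raw: str) -> str:
    """Flatten the JSON fields into one sentence carrying the same detail."""
    data = json.loads(raw)
    sentence = " ".join(data[k] for k in ("who", "what", "where", "when") if data.get(k))
    if data.get("how"):
        sentence = f"{sentence}, {data['how']}"
    return sentence[:1].upper() + sentence[1:] + "."


print(to_natural_language(json_prompt))
# A koi fish breaches the surface of a serene pond among lily pads at midday, high angle looking down, photographic.
```

Both versions carry identical detail; the plain sentence is simply easier to read and revise.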