Writing Documents for Search and AI Knowledge Assistants
A practical guide for anyone who creates policy documents, user guides, or procedural manuals that will be searchable through enterprise search or ingested by an AI knowledge chatbot. How you structure and write your document directly affects whether the right information is found, and whether the answers users receive are accurate and complete.
Structure Your Document with Clear Headings
Headings are the most important thing you can get right. Both search engines and AI knowledge chatbots use them to understand which section a piece of text belongs to and how sections relate to each other.
Use heading styles, not manual formatting
In Word, use the built-in Heading 1, Heading 2, and Heading 3 styles from the Styles ribbon. Do not create headings by making text bold or increasing the font size. Search systems and AI chatbots read the style name to understand the document's hierarchy. A bold line in 14pt font looks like a heading to a human reader, but the system treats it as regular body text.
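One way to catch manually formatted headings before publication is to scan the document's XML for short paragraphs that are entirely bold but carry no heading style. The sketch below is illustrative only. It assumes the contents of `word/document.xml` are already available as a string, and that built-in heading style IDs start with "Heading", which holds for English-language Word but may differ in localized templates. The name `find_fake_headings` and the 12-word cutoff are hypothetical choices, and bold applied through a named style rather than direct formatting is not detected.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _attr(name):
    """Build the namespaced attribute name, e.g. w:val."""
    return f"{{{W}}}{name}"


def _is_on(elem):
    """A toggle such as <w:b/> is on unless its w:val is false, 0, or off."""
    return elem is not None and elem.get(_attr("val"), "true") not in ("false", "0", "off")


def find_fake_headings(document_xml, max_words=12):
    """Return the text of short, fully bold paragraphs with no heading style."""
    root = ET.fromstring(document_xml)
    suspects = []
    for para in root.iter(f"{{{W}}}p"):
        style = para.find("w:pPr/w:pStyle", NS)
        style_id = style.get(_attr("val"), "") if style is not None else ""
        if style_id.lower().startswith("heading"):
            continue  # a real heading style is applied (assumed naming)
        # Only direct child runs that contain text are considered.
        runs = [r for r in para.findall("w:r", NS) if r.find("w:t", NS) is not None]
        text = "".join(t.text or "" for r in runs for t in r.findall("w:t", NS)).strip()
        if not text or len(text.split()) > max_words:
            continue
        if all(_is_on(r.find("w:rPr/w:b", NS)) for r in runs):
            suspects.append(text)
    return suspects
```

Because a DOCX file is a zip archive, the XML can be read with the standard library, for example `zipfile.ZipFile(path).read("word/document.xml")`. Treat any hit as a prompt to review the paragraph, not as proof of a problem.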
In PDF documents, use visually distinct sizes for each heading level (e.g., 16pt for major sections, 14pt for subsections, 12pt for sub-subsections). If all your headings look the same, the system relies on numbering patterns (A., 1., i.) to figure out the hierarchy. This works, but is less reliable.
Make headings describe the topic, not the format
A heading should tell the reader (and the system) what the section is about. Generic labels like "Procedure", "Details", or "Step 2" do not provide that signal.
| Instead of | Try |
|---|---|
| Procedure | How to Create a Purchase Order |
| Step 2 | Step 2: Assign Charge of Account |
| Details | Fund Status: Reserved vs. Partially Liquidated |
| Notes | Important Limitations and Exceptions |
| Appendix A | Appendix A: Approval Thresholds by Amount |
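The "Instead of" column above can be turned into a rough automated check. The following is a minimal sketch, assuming headings have already been extracted as plain strings. The label set contains only the generic labels named in this guide, and the pattern for bare structural labels ("Step 2", "Section 4", "Appendix A", "Table 3") is an illustrative assumption you would extend for your own documents.

```python
import re

# Generic labels named in this guide; extend for your own documents.
GENERIC_LABELS = {"procedure", "details", "notes"}

# A structural word followed only by a number or a single letter,
# e.g. "Step 2" or "Appendix A", with no descriptive text after it.
BARE_LABEL = re.compile(
    r"^(?:step|section|appendix|table)\s+(?:\d+|[a-z])\.?$", re.IGNORECASE
)


def is_generic_heading(heading):
    """Return True when the heading names a format rather than a topic."""
    text = heading.strip()
    return text.lower() in GENERIC_LABELS or bool(BARE_LABEL.match(text))
```

A heading such as "Step 2: Assign Charge of Account" passes this check because descriptive text follows the label.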
Don't skip heading levels
Go from Heading 1 to Heading 2 to Heading 3 in order. Don't jump from Heading 1 directly to Heading 3.
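Skipped levels are easy to detect once the heading levels are listed in document order. The sketch below assumes the headings are supplied as `(level, title)` pairs; the function name `find_skipped_levels` is illustrative. Moving back up the hierarchy (for example, from Heading 3 to Heading 2) is allowed; only jumps of more than one level downward are reported.

```python
def find_skipped_levels(headings):
    """Return (title, previous_level, level) for each heading that skips a level.

    headings: list of (level, title) tuples in document order.
    """
    problems = []
    previous_level = 0  # before the first heading, so a document should start at level 1
    for level, title in headings:
        if level > previous_level + 1:
            problems.append((title, previous_level, level))
        previous_level = level
    return problems


outline = [
    (1, "Purchase Orders"),
    (3, "Assign Charge of Account"),  # jumps from Heading 1 to Heading 3
    (2, "Amendments"),
    (3, "Amending the Charge of Account"),
]
skipped = find_skipped_levels(outline)
```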
Keep Sections Focused
One topic per section
Each section should cover one topic. If a single section discusses amendments, closures, and cancellations, the system has difficulty determining which topic the section is really about, and it becomes a weak match for any specific question.
Aim for 200 to 800 words per section
Sections shorter than 200 words may lack enough context for the system to understand what they are about. Sections longer than 800 words are more likely to contain multiple sub-topics, which dilutes the focus.
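The 200 to 800 word guideline can be checked mechanically. Here is a minimal sketch, assuming the document has already been split into a mapping of heading to body text. Words are counted by splitting on whitespace, which may differ slightly from the count Word reports, and the function name `check_section_lengths` is illustrative.

```python
def check_section_lengths(sections, min_words=200, max_words=800):
    """Map each heading to (word_count, verdict).

    sections: dict of heading -> body text.
    """
    report = {}
    for heading, body in sections.items():
        count = len(body.split())  # whitespace-separated words
        if count < min_words:
            verdict = "too short"
        elif count > max_words:
            verdict = "too long"
        else:
            verdict = "ok"
        report[heading] = (count, verdict)
    return report
```

Treat the verdicts as prompts for review: a short section may be fine if it answers one question completely.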
If a section naturally runs long (for example, a multi-step procedure), break it into subsections, each with its own descriptive heading.
Use the Words Your Readers Would Search For
Use full terms alongside abbreviations
If your document uses "COA" throughout, but your readers search for "charge of account" or "account code", the system may not connect the two. Spell out the abbreviation on first use in each major section, not just once at the top of the document.
| Document uses | Readers might search for | Recommendation |
|---|---|---|
| COA | Charge of account, account code | Write "Charge of Account (COA)" in each section that discusses it |
| REQ | Requisition, purchase request | Write "Requisition (REQ)" |
| GL Code | Account code, expenditure type | Mention both terms near each other |
| PO line cancellation | How to cancel a PO item | Include "cancel" and "item" alongside formal terms |
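A simple script can flag sections that use an abbreviation without its full term. The sketch below is a simplification under stated assumptions: it takes a hypothetical glossary that maps each abbreviation to its full term, and it only checks that the full term appears somewhere in the same section, not that it appears on first use.

```python
import re


def undefined_abbreviations(section_text, glossary):
    """Return abbreviations used in the section whose full term never appears in it.

    glossary: dict of abbreviation -> full term (an assumed, author-maintained list).
    """
    missing = []
    lowered = section_text.lower()
    for abbreviation, full_term in glossary.items():
        # Match the abbreviation as a whole word, case-sensitively.
        used = re.search(rf"\b{re.escape(abbreviation)}\b", section_text)
        if used and full_term.lower() not in lowered:
            missing.append(abbreviation)
    return missing


glossary = {"COA": "Charge of Account", "REQ": "Requisition"}
section = "Enter the COA before you submit. A Requisition (REQ) is routed for approval."
missing = undefined_abbreviations(section, glossary)
```

Here the section uses "COA" without spelling it out, so "COA" is reported while "REQ" is not.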
Start sections with the key concept
The first sentence of each section carries extra weight. Begin with the topic, not with a preamble.
Write Self-Contained Sections
Don't rely on context from earlier in the document
A reader may land on any section of your document without having read the beginning. Each section should make sense on its own.
- Re-state who is responsible ("the buyer", "the requisitioner") rather than writing "the same person" or "they"
- Name the system or module ("in the Purchase Order module") rather than "in the same screen"
- Repeat the scenario ("when amending the Charge of Account") rather than "in this case"
Prefer exact section headings in cross-references
When referencing other parts of your document, use the exact section heading rather than relative language.
| Instead of | Try |
|---|---|
| See above | See "E. Amendments of PO" |
| Refer to the previous section | Refer to "Fund Status: Reserved" |
| See page 12 | See "Step 2: Assign Charge of Account" |
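Relative cross-references like those in the "Instead of" column follow predictable wording, so a regular expression can find most of them. This is a minimal sketch; the pattern covers only the phrasings mentioned in this guide plus a few close variants, and it will miss others.

```python
import re

# Relative phrasings such as "see above", "refer to the previous section",
# "see page 12", and "as mentioned previously".
RELATIVE_REFERENCE = re.compile(
    r"\b(?:see|refer to)\s+(?:above|below|the\s+(?:previous|next|following)\s+section|page\s+\d+)\b"
    r"|\bas\s+mentioned\s+(?:previously|above|earlier)\b",
    re.IGNORECASE,
)


def find_relative_references(text):
    """Return every relative cross-reference phrase found in the text."""
    return [match.group(0) for match in RELATIVE_REFERENCE.finditer(text)]


sample = (
    "See above for the approval process. Refer to the previous section. "
    'See page 12. See "E. Amendments of PO".'
)
found = find_relative_references(sample)
```

The quoted reference to "E. Amendments of PO" is not flagged because it names the target heading exactly.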
Working with Tables
Tables are a good way to present structured information such as comparison matrices, field descriptions, and threshold values. The system keeps tables intact, so the information stays together.
- Keep tables concise. A table with 30 rows may push surrounding text away from it, separating the table from its context. If you have a large dataset, consider splitting it into smaller tables under subsection headings.
- Add a descriptive heading above each table. The heading helps the system understand what the table contains. A table under "Approval Thresholds by Amount" is easier to find than one under "Table 3".
- Use clear column headers. They help both the reader and the system interpret the data.
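If your table data lives in a spreadsheet or script, splitting a large table into smaller ones can be automated. The sketch below assumes the rows are already held as lists; the default of 10 rows per table is an arbitrary illustration, not a limit stated by any system. Each smaller table repeats the header row so it stays readable on its own, and you would still add a descriptive heading above each one by hand.

```python
def split_table_rows(header, rows, max_rows=10):
    """Split rows into tables of at most max_rows data rows, each led by the header."""
    return [
        [header] + rows[start:start + max_rows]
        for start in range(0, len(rows), max_rows)
    ]
```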
Working with Images and Screenshots
AI knowledge chatbots can read images and convert them into searchable text. How well this works depends partly on how you present images in your document.
Place screenshots immediately after the step they illustrate
When a screenshot appears right after "Click Submit to route the PO for approval", the system uses the surrounding text to understand what the screenshot shows. If the screenshot is separated from its related instruction by a page break or unrelated content, the system has less context to work with.
Number your steps explicitly
If your procedure uses numbered steps (Step 1, Step 2, or 1., 2., 3.), the system detects this pattern and structures its interpretation of the screenshots accordingly. Implicit ordering ("First... Then... Next...") is harder for the system to follow.
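To see whether a procedure relies on implicit ordering, you can count how many lines start with an explicit step number versus an ordering word. This is a rough sketch, assuming the procedure is available as a list of text lines; it recognizes only the patterns named above ("Step 1", "1.", and "First", "Then", "Next").

```python
import re

# Lines that start with "Step 3" or with a number followed by "." or ")".
EXPLICIT_STEP = re.compile(r"^\s*(?:step\s+\d+\b|\d+[.)]\s)", re.IGNORECASE)
# Lines that start with an ordering word instead of a number.
IMPLICIT_STEP = re.compile(r"^\s*(?:first|then|next)\b", re.IGNORECASE)


def classify_steps(lines):
    """Count lines with explicit step numbers and lines with implicit ordering words."""
    explicit = sum(1 for line in lines if EXPLICIT_STEP.match(line))
    implicit = sum(1 for line in lines if IMPLICIT_STEP.match(line))
    return {"explicit": explicit, "implicit": implicit}
```

A nonzero "implicit" count suggests rewriting those lines as numbered steps.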
Name UI elements in the text
If your instruction says "Click Submit" and the screenshot shows a button labeled "Submit", the system can connect the two. If your instruction says "Click the button at the bottom of the page", the system has to infer which button you mean from the screenshot alone.
Don't put critical information only in images
If a flowchart or decision tree contains important logic (e.g., "If amount > $50,000, route to Director for approval"), describe the key points in the body text as well. The system processes images, but text-based content is more reliably searched and retrieved.
Choosing a File Format
| | Word (DOCX) | PDF | SharePoint Page |
|---|---|---|---|
| Heading support | Best. Word styles translate directly into the document hierarchy | Good if headings have distinct sizes or numbering patterns; fragile if all headings look the same | Good. HTML heading tags are unambiguous |
| Tables | Well preserved | Mostly preserved; complex layouts may lose structure | Well preserved |
| Images | Fully supported | Fully supported | Supported if image processing is enabled |
| Best for | Procedural guides and policy documents | Finalized documents with consistent formatting | Quick-reference pages and announcements |
If you have the choice, Word (DOCX) is the most reliable format. The heading styles you set in Word are preserved exactly, with no inference or guesswork needed. PDF works well when you maintain consistent visual formatting, but the system has to interpret the visual layout rather than reading explicit structure.
Pre-Publication Checklist
- Headings use built-in styles (Word) or distinct formatting (PDF), not manual bold/font changes
- No heading levels are skipped (Heading 1 > Heading 2 > Heading 3)
- Each heading describes the section's topic in concrete terms
- Sections cover one topic each and are 200 to 800 words
- Abbreviations are spelled out on first use in each major section
- Each section starts with its key concept, not a preamble
- Sections make sense on their own without needing earlier context
- Cross-references use exact section headings, not "see above" or page numbers
- Tables have descriptive headings and are concise
- Screenshots are placed immediately after the step they illustrate
- Steps are explicitly numbered
- Important information in images is also described in body text
Reference Template
A concrete before-and-after example that authors can model when creating or revising documents for search and AI knowledge chatbots, plus a reusable template structure and reviewer checklist.
Before: A Common but Poorly Structured Section
Section 4
When employees travel for work, they may be eligible for reimbursement of expenses incurred during the trip. It should be noted that all expenses must be pre-approved and properly documented. The process is outlined below.
For flights, you should book through the approved travel management company (TMC). If you booked outside the TMC, please provide justification. Hotel costs are also reimbursable, subject to the daily rate cap. Meals follow the DSA. For ground transport, taxis and rideshares are allowed; rental cars require prior approval. Receipts are required for all expenses over $25. See above for the approval process. Refer to Appendix B for the rate table.
What's wrong:
- Heading says "Section 4" (not descriptive)
- Opens with a preamble instead of the topic
- Mixes four sub-topics in one section (flights, hotels, meals, ground transport)
- Uses "DSA" without spelling it out
- Says "See above" and "Refer to Appendix B" (not searchable cross-references)
After: Revised to Follow the Guide
Travel Expense Reimbursement
Employees may claim reimbursement for pre-approved expenses incurred during work-related travel. All claims must be submitted within 30 calendar days of the trip end date using the Expense Report form in the ERP system.
Flights
All flights must be booked through the approved Travel Management Company (TMC). Economy class is the default for trips under six hours; premium economy may be approved for longer flights with prior written authorization from the traveler's supervisor.
If a flight was booked outside the TMC, the employee must include a written justification in the expense report explaining why the TMC was not used. Reimbursement will be limited to the lowest available TMC fare for the same route and dates.
Hotels
Hotel costs are reimbursable up to the Daily Subsistence Allowance (DSA) lodging rate for the destination city. The DSA rates are published in the "DSA Rate Table by City" section of the Travel Policy Handbook.
If the hotel cost exceeds the DSA lodging rate, the employee must provide written justification (e.g., only hotel within safe distance of the meeting venue). Excess costs without justification are the employee's responsibility.
Meals
Meal expenses are reimbursed using the Daily Subsistence Allowance (DSA) meal rate for the destination city. Actual receipts are not required for meals within the DSA rate. Meals exceeding the DSA rate are not reimbursable.
When meals are provided by the host organization or included in a conference registration, the DSA meal rate for that day is reduced by 30% per provided meal.
Ground Transportation
Taxis and rideshare services (e.g., Uber, Lyft) are reimbursable for work-related travel between the airport, hotel, and meeting venue. Receipts are required for any single trip over $25.
Rental cars require prior written approval from the traveler's supervisor. When approved, reimbursement covers the rental fee, fuel, tolls, and parking. Insurance coverage must be verified with the Risk Management Office before booking.
Receipts and Documentation
Receipts are required for all individual expenses over $25. For expenses of $25 or less, the expense report line item description is sufficient.
All receipts must be uploaded as attachments to the Expense Report in the ERP system. Photos of paper receipts are acceptable provided the date, vendor name, amount, and items are legible.
What Changed and Why
| Principle | Before | After | Why it helps |
|---|---|---|---|
| Descriptive heading | "Section 4" | "Travel Expense Reimbursement" | A user searching for "travel reimbursement" now matches the heading directly |
| One topic per section | Flights, hotels, meals, and ground transport all in one paragraph | Each expense type has its own subsection | A question about "rental car approval" finds only the ground transport section |
| Heading hierarchy | One level only | H2 for the main topic, H3 for each expense type | The system knows "Flights" is part of "Travel Expense Reimbursement" and includes that context |
| Front-loaded concept | "When employees travel for work, they may be eligible..." | "Employees may claim reimbursement for pre-approved expenses..." | Opens with actionable information, not a conditional preamble |
| Abbreviations | "Meals follow the DSA" | "Daily Subsistence Allowance (DSA)" in each section | A reader landing on the Meals section understands "DSA" without reading earlier sections |
| Self-contained | "See above for the approval process" | Each section states its own approval requirements | A reader gets a complete answer from any single section |
| Exact cross-refs | "Refer to Appendix B" | "the DSA Rate Table by City section of the Travel Policy Handbook" | The system can find and retrieve the referenced section |
| Section length | One paragraph (~150 words) | Each subsection is 60 to 120 words; full section ~350 words | Each sub-topic answers a specific question on its own |
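The "Heading hierarchy" row is easier to picture with a small example of how a heading path could be built for each section. The sketch below is an assumption about one common approach, not a description of how any particular search system or chatbot works; the function name `heading_paths` and the " > " separator are illustrative.

```python
def heading_paths(headings):
    """Return the full heading path for each heading.

    headings: list of (level, title) tuples in document order.
    """
    stack = []  # headings currently "open", from outermost to innermost
    paths = []
    for level, title in headings:
        # Close any heading at the same or a deeper level before adding this one.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, title))
        paths.append(" > ".join(name for _, name in stack))
    return paths


paths = heading_paths([
    (2, "Travel Expense Reimbursement"),
    (3, "Flights"),
    (3, "Hotels"),
])
```

With a path such as "Travel Expense Reimbursement > Flights" attached, a passage from the Flights subsection carries its parent topic even when it is retrieved on its own.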
Template Structure
Use this outline as a starting point when creating a new policy document or user guide. Replace the bracketed placeholders with your content.
Image and Screenshot Placement
When documenting a system procedure with screenshots, follow these key points:
- Each step names the UI element ("the Field Name field", "Click Submit")
- Screenshots appear directly after the step they illustrate, not grouped at the end
- Steps are explicitly numbered (1, 2, 3), not implied ("First... Then... Next...")
- The heading describes the outcome ("Assign Charge of Account"), not the format ("Step 2")
Checklist for Reviewers
When reviewing a document before publication, verify:
- Every heading describes a topic (no "Section 3", "Procedure", or "Details")
- Heading levels follow the sequence: Heading 1, then Heading 2, then Heading 3 (no levels skipped)
- Each section covers one topic and is 200 to 800 words
- Abbreviations are spelled out on first use in each major section
- The first sentence of each section states the key concept
- No section says "see above", "as mentioned previously", or references a page number
- Cross-references use the exact heading of the target section
- Tables have descriptive headings and clear column headers
- Screenshots follow the step they illustrate (not grouped at the end)
- Steps are explicitly numbered
- Critical information in images is also stated in the body text