Discovery, Data Collection, and Domain Knowledge Acquisition

R.P. Churchill

Discovery vs. Data Collection

Discovery is a qualitative process. It identifies nouns (things) and verbs (actions, transformations, decisions, calculations).

Data Collection is a quantitative process. It identifies adjectives (colors, dimensions, rates, times, volumes, capacities, materials, properties).

Discovery comes first, so you know what data you need to collect.

Imagine you're going to simulate or automate the process. What values do you need? This is the information the implementation teams will need.
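One way to picture the hand-off from discovery to data collection is a record per process step with empty slots for the values an implementation team would need. The sketch below is illustrative only: the `ProcessStep` class, the `missing_data` helper, and the step and parameter names are all hypothetical and do not come from any particular engagement.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProcessStep:
    """A verb found during discovery, with slots for data still to be collected."""
    name: str
    # Adjectives (quantities) stay None until data collection fills them in.
    parameters: dict[str, Optional[float]] = field(default_factory=dict)


def missing_data(steps: list[ProcessStep]) -> list[tuple[str, str]]:
    """Return (step, parameter) pairs that were identified but not yet measured."""
    return [
        (step.name, param)
        for step in steps
        for param, value in step.parameters.items()
        if value is None
    ]


# Hypothetical two-step process mapped during discovery.
steps = [
    ProcessStep("inspect vehicle", {"duration_min": 4.5, "stations": None}),
    ProcessStep("process documents", {"duration_min": None, "stations": 2}),
]

print(missing_data(steps))
```

The list that comes back is effectively the data collection to-do list: discovery named the slots, and collection fills them.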

Context of Discovery

Elicitation is discovering the customer's needs. Discovery is about mapping the customer's existing process.

If there is no existing process, i.e., a new (greenfield) process is being built from scratch, then a form of discovery will occur during Requirements and Design.

Process Mapping is a big part of discovery. A picture is worth a thousand words. An animation is worth even more.

Process Mapping (continued)

Processes may be mapped differently based on needs, industry standards, and the information to be represented.

Example process maps:
  • Mechanical Pulping Mill, Quebec City, QC
  • Offgas System in BWR Nuclear Power Plant, Richland, WA
  • Land Border Port of Entry, Columbus, NM
  • Architecture Study: General Purpose Discrete-Event Simulation
Process Mapping (continued)
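  • S-I-P-O-C (Suppliers, Inputs, Process, Outputs, Customers) vs. C-O-P-I-S (the same elements read in reverse, starting from the customer)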
  • Any number of inputs and outputs are possible.
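As a rough sketch of how one mapped block might be recorded, the class below stores a process with lists of any length for its suppliers, inputs, outputs, and customers, and can be read in either direction. The `Sipoc` class and the example block are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sipoc:
    """One process block with its suppliers, inputs, outputs, and customers."""
    process: str
    suppliers: list[str] = field(default_factory=list)
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    customers: list[str] = field(default_factory=list)

    def sipoc_order(self) -> list[tuple[str, list[str]]]:
        """Read the block supplier-first."""
        return [
            ("Suppliers", self.suppliers),
            ("Inputs", self.inputs),
            ("Process", [self.process]),
            ("Outputs", self.outputs),
            ("Customers", self.customers),
        ]

    def copis_order(self) -> list[tuple[str, list[str]]]:
        """Read the same block customer-first."""
        return list(reversed(self.sipoc_order()))


# Hypothetical block with two inputs and two outputs.
block = Sipoc(
    process="sort parcels",
    suppliers=["receiving dock"],
    inputs=["parcels", "routing labels"],
    outputs=["local parcels", "long-haul parcels"],
    customers=["delivery vans", "line-haul trucks"],
)

for heading, items in block.copis_order():
    print(heading, items)
```

The data is identical in both readings; only the order of the conversation changes.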
Process Mapping (continued)

I give specific names to modular components.

Data Collection (Process Characterization)
  • Captures qualitative descriptions of entity types and characteristics, process types and characteristics, and decisions made.
  • Captures quantitative data:
    • physical dimensions, volumes, and storage capacities
    • arrival and departure rates and times
    • diversion percentages (what parts of outputs go where)
    • process durations
    • whatever is needed to describe transitions
    • counts or quantities of what's stored
    • velocities, frequencies, and fluxes
    • number of stations in each sub-process
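Several of the quantities listed above can be sanity-checked as soon as they are collected. The sketch below assumes a hypothetical `SubProcess` record and two simple checks: utilization taken as arrival rate divided by total station capacity, and diversion fractions that should sum to one. The field names, units, and numbers are illustrative assumptions, not measured values.

```python
from dataclasses import dataclass


@dataclass
class SubProcess:
    name: str
    arrival_rate_per_hr: float    # entities arriving per hour
    duration_min: float           # average processing time per entity, in minutes
    stations: int                 # number of parallel stations
    diversions: dict[str, float]  # destination -> fraction of output sent there

    def utilization(self) -> float:
        """Arrival rate divided by the combined capacity of all stations."""
        capacity_per_hr = self.stations * 60.0 / self.duration_min
        return self.arrival_rate_per_hr / capacity_per_hr

    def diversions_balanced(self, tol: float = 1e-9) -> bool:
        """True when every output is accounted for (fractions sum to 1)."""
        return abs(sum(self.diversions.values()) - 1.0) <= tol


# Hypothetical values for illustration only.
primary = SubProcess(
    name="primary inspection",
    arrival_rate_per_hr=90.0,
    duration_min=2.0,
    stations=4,
    diversions={"exit": 0.9, "secondary inspection": 0.1},
)

print(primary.utilization())         # 90 / (4 * 60 / 2) = 0.75
print(primary.diversions_balanced())
```

A utilization above 1.0 or diversions that do not balance usually means a value was mis-recorded or a path through the process was missed.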

Discovery and Data Collection both make use of the Observation technique in the BABOK. Methods include:
  • Walkthroughs: guided and unguided (Waste Walk)
  • Drawings, Documents, Manuals, Specifications
  • Electronic Collection (real-time vs. historical, integrated vs. external sensors)
  • Visual / In-Person (notes, logsheets, checklists, mobile apps)
  • Interviews (with SMEs)
  • Surveys
  • Video
  • Photos
  • Calculations
  • Documented Procedures and Policies
Domain Knowledge Acquisition
  • Domain knowledge is acquired from prior experience or training, or from the process SMEs as you go.
  • It is what you need to know to:
    • capture process details
    • analyze the operations
    • perform calculations
    • make sure you don't miss anything
BA Knowledge vs. SME Knowledge
  • As a BA you need to know how to analyze processes in general.
  • You should have an arsenal of techniques, tools, and understandings to bring to the process.
  • You don't need to know everything the SMEs know (that is rather the point), but you need to know enough to understand them.
  • Build up a glossary and a shared knowledge base.
BA Knowledge vs. SME Knowledge (continued)
  • Some fields require specific knowledge, for example thermodynamics, medicine, or law.
  • In other situations you only need to be able to fit SME knowledge into a business-oriented process flow. That is what you own and what you bring.
    • If you are automating a well-understood process then you're just bringing the knowledge of the automation technique.
    • If you are improving an existing process you bring your knowledge of rearrangement and substitution techniques.
    • If you are helping to design a new process you bring your knowledge of general principles of process design and the theory of constraints.
BA Knowledge vs. SME Knowledge (continued)
  • You should review all findings and proposals iteratively until you and your customers agree that everything is understood or at least properly recorded and organized.
  • You should know and respect what you don't know.
Making Sure The Analysis Is Thorough


  • Consider all elements from all angles.
  • UML is a formal example of a framework for doing this. The BABOK is more diffuse.
  • Simulation is great for understanding because the results show whether everything has been included.
  • Formal mathematical proofs of correctness and optimality exist for some problems.
  • Customer review and realized results are the best proof for most BA engagements.
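To illustrate the simulation point, here is a minimal sketch of a single first-come, first-served station with fixed arrival times and a fixed service time. The function name `simulate_single_station` and the numbers are hypothetical; a real model would use the measured distributions gathered during data collection.

```python
def simulate_single_station(arrival_times, service_time):
    """Return each entity's wait before service at one first-come, first-served station."""
    waits = []
    station_free_at = 0.0
    for arrival in arrival_times:
        # Service starts when both the entity and the station are available.
        start = max(arrival, station_free_at)
        waits.append(start - arrival)
        station_free_at = start + service_time
    return waits


# Hypothetical inputs: one arrival per time unit, 1.5 time units of service each.
arrivals = [0.0, 1.0, 2.0, 3.0]
waits = simulate_single_station(arrivals, service_time=1.5)
print(waits)  # [0.0, 0.5, 1.0, 1.5]
```

Here the waits grow without bound. If the real process shows no such backlog, the model is missing something, perhaps a second station or a diversion, and that mismatch is what sends you back to discovery.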
Detailed Articles


Data Collection:

Domain Knowledge Acquisition:

This presentation and other information can be found at my website: