Module 2: Health Information Systems: Data Management Concepts

This module shows you how to recognize what data you need and organize it in a way that allows you to access and use it effectively. Read the learning objectives to discover what you will learn in this module.

Learning Objectives

After taking this module, you’ll be able to:

  1. Define data management concepts
  2. Describe components of the data management process

Learning Activities

  • Video: Data Management – Part 1 (4 min)

    Instructions: Click or tap the video to play.

  • Video: Data Management – Part 2 (4 min)

    Instructions: Click or tap the video to play.

  • Video: Information Literacy and Data Management (5 min)

    Let’s review the definition of information literacy. Information literacy is the ability to recognize when you need specific information and then access, evaluate, organise and use information from a variety of sources.

    Part of managing data is recognizing what data you need and organising the data in a way that will allow you to use it to meet many different information needs.


    Let’s watch a video that explains why we need to manage data.

    Instructions: Click or tap the video to play.

  • Reading: What is data management? (5 min)

    Data management is the development, execution, and supervision of plans, policies, programs, and practices that control, protect, deliver, and enhance the value of data and information assets.

    Instructions: Click or tap each button to see an example of data management plans, policies, programs, and practices.


    Plans

    A description of policies and practices that answer key questions about the data in your system: What data exists? How is data collected, stored, accessed, and protected? How will the data most likely be used and by whom?

    Policies

    Guidelines that are used to protect patient confidentiality and keep data secure.

    Programs

    A set of activities for routinely assessing and controlling data quality.

    Practices

    A pop-up alert that signals that data has not been entered into a required field.

  • Reading: Phases of Data Management (5 min)

    Let’s break data management down a bit more. There are three main phases of data management: capture, store, and retrieve.


    When designing and developing these aspects of our data management plan, we must carefully consider what information is needed and how and where it will be used. The quality of our data will greatly influence the quality of our information system - being strategic about how data is managed is crucial!

    Let’s look at what happens during each phase of data management.

  • Reading: Phases of Data Management - Capture (10 min)

    Let’s begin by talking about capturing data. Capture describes three types of activities: collecting, organising, and recording or entering data.


    Do you remember the definition of data? Data points are facts, text, or images that have little meaning until they are organised in a way that turns them into information. To be useful, the data we manage need to be captured in a way that preserves their meaning and then effectively turned into information.


    Let’s look at an example of data capture.

    Vital signs are made up of individual numbers, each with their own meaning. When a clinician uses these four data points together, they provide information on the state of a patient’s most basic body functions.

    Knowing who will use this information, how, and when can inform decisions about how this data is collected and recorded. In this next section, we’ll look at collecting data.

  • Reading: Data Collection (10 min)

    If we want information to be used effectively, we must figure out what types of data we will collect and the process for collecting that data. The data we collect can be qualitative, quantitative, sound, or image data.

    Instructions: Click or tap each button to take a closer look at each type of data.


    Qualitative data are data that can’t be measured by numbers, for example a person’s occupation. For this reason, we express this type of data with words, which usually indicate a category or group. These categories or groups can be subjectively defined. In our occupation example, we would group occupation into categories like community health workers, nurses, lab technicians, pharmacists, and medical superintendents. Some people may choose to group occupation into additional categories, like facility-in-charge, hospital administrator, and health officer.


    In addition to qualitative data, we can also collect quantitative data. This type of data captures characteristics that are numerical in nature. This means that they are expressed as numbers and what they represent, for example 4 years of age. Expressing this data as numbers allows you to make calculations and provides greater flexibility when you manipulate the data.

    Some laboratory results are quantitative data, such as CD4 counts or viral load.


    Sound is another valuable type of data. Sometimes, equipment is needed to amplify sounds that are difficult to hear. For instance, a doctor uses a stethoscope to hear a patient’s heartbeat. Some sounds require multiple samples to detect changes. We can assign numbers or descriptive terms to turn sound data into something more meaningful: information. Let’s think about a patient’s heartbeat. Doctors usually describe heartbeat as a frequency: beats per minute. This allows the doctor to use a number to collect sound data in a useful way.


    Images also provide valuable data. As with sounds, different equipment is used to capture images. For example, an x-ray machine is used to take a chest x-ray of a patient suspected of having TB. Again, numbers or descriptive terms may need to be assigned to make data easier to analyse. When a doctor looks at a chest x-ray to screen a patient for TB, he or she can classify and record the data as “normal” or “abnormal.”
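
    To make the four types of data more concrete, here is a minimal sketch in Python of how one visit record might hold each type. The field names, categories, and values are invented for illustration and do not come from any particular health information system.

```python
from dataclasses import dataclass
from enum import Enum


class XrayResult(Enum):
    """Descriptive terms assigned to an image so it can be analysed."""
    NORMAL = "normal"
    ABNORMAL = "abnormal"


@dataclass
class VisitRecord:
    occupation: str         # qualitative: a category expressed in words
    age_years: int          # quantitative: a number and what it represents
    heart_rate_bpm: int     # sound, recorded as beats per minute
    chest_xray: XrayResult  # image, classified with a descriptive term


# Hypothetical records, for illustration only.
records = [
    VisitRecord("nurse", 34, 72, XrayResult.NORMAL),
    VisitRecord("pharmacist", 51, 88, XrayResult.ABNORMAL),
    VisitRecord("nurse", 45, 64, XrayResult.NORMAL),
]

# Quantitative data support calculations such as an average.
mean_age = sum(r.age_years for r in records) / len(records)

# Qualitative data are counted per category rather than averaged.
occupation_counts = {}
for r in records:
    occupation_counts[r.occupation] = occupation_counts.get(r.occupation, 0) + 1

# Image data recorded as a category can be counted in the same way.
abnormal_xrays = sum(1 for r in records if r.chest_xray is XrayResult.ABNORMAL)
```

    Notice that the sound and image data only become easy to count or compare once a number or a descriptive term has been assigned to them.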


    Regardless of the type of data being collected, there are many different ways to collect the data captured in our information systems. Data from a primary source are collected first-hand, directly from the original source. Data from a secondary source have already been collected by someone else.


    Review the list of data sources. Choose or tap the primary sources.

  • Reading: Choosing Data Collection Methods (5 min)

    When designing a health information system and its tools, stakeholders make important decisions about the data collection methods that will be used by health care workers. Remember how evaluating information is a part of being information literate? When making these decisions, we will use the same considerations that we would when evaluating information.

    Instructions: Click or tap on each button to read about what should be considered when choosing a data collection method.


    Will this method give accurate information or information appropriate for the intended use? Is this the information needed? Does it answer my question? For instance, Dr. Mazingo needs to know the number of patients with abnormal chest x-rays. The data collection method uses different categories (light, clear, dark, etc.), so the data are not appropriate for his needs.


    Is it an authoritative and credible source? Are the data from a primary source or secondary source? If it is secondary source data, are those who collected the data credible? For example, were the patients’ vital signs in the electronic medical record (EMR) directly entered by the nurse who took them? Or did a data clerk transfer them from a paper tool into the EMR system? If the data were transferred from a paper tool, is the clinician or nurse who filled the paper tool credible?


    Will the information be objective? What biases might be introduced by the collection method? Does it provide good evidence for the information or does it appear to be more opinion-oriented? For example, electronic systems use pre-populated drop-down menus or tick boxes to ensure that everyone collects data in a standard manner and to reduce bias.


    Will the information be current? How long after an event occurred is it being captured? For example, a clinician’s notes made at the time of a visit are more current than information abstracted from medical records once a year for reporting.


    Will the method provide good coverage? Will it ensure completeness?
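
    Some of the considerations above can be turned into simple checks. The following Python sketch shows one possible way to test for a standard choice (objectivity), measure the delay between an event and its capture (currency), and calculate how complete a record is (coverage). The field names, the list of choices, and the sample record are assumptions made for this example.

```python
from datetime import date

# Hypothetical required fields and fixed choices, for illustration only.
REQUIRED_FIELDS = ("patient_id", "visit_date", "xray_result")
XRAY_CHOICES = ("normal", "abnormal")  # fixed list, like a drop-down menu


def is_standard_choice(value):
    """Objectivity: accept only values from the fixed list."""
    return value in XRAY_CHOICES


def capture_delay_days(event_date, entry_date):
    """Currency: number of days between the event and its capture."""
    return (entry_date - event_date).days


def completeness(record):
    """Coverage: share of required fields that are filled in."""
    filled = sum(1 for f in REQUIRED_FIELDS if record.get(f) not in (None, ""))
    return filled / len(REQUIRED_FIELDS)


# A record with a missing x-ray result.
record = {"patient_id": "P-001", "visit_date": date(2024, 3, 4), "xray_result": ""}
```

    For example, is_standard_choice("dark") returns False because "dark" is not one of the two standard categories, and completeness(record) shows that only two of the three required fields were filled in.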

  • Video: Aggregating and Disaggregating Data (5 min)

    Data Collection Method Limitations

    In practice, the choice of data collection method is limited by financial constraints, time, and human resources. Other factors that influence this choice include:
    • Logistics: Are health care workers able to physically get to the data?
    • Participation: How many people can participate?
    • Question content: What type of information will be collected?

    Instructions: Click or tap the video to play. After you watch the video, respond to the quiz question listed below.

    Question: Ms. Abera, a District Health Information Officer in Katanga, wants to find out which facilities in her district have the highest number of pregnant women who are also HIV positive. What is the best way for Ms. Abera to receive the data?

  • Video: Organising Data (5 min)

    Now let’s move on to the second activity of data capture: organising data. Another word used interchangeably with organisation is collation.

    Instructions: Click or tap the video to play.

  • Reading: Terms Used in Data Collection (5 min)

    Decisions about how to organise the data are made when data collection and reporting forms or software systems are designed. As users, we find that stakeholders have already made some of these decisions by the time the health information system software arrives at our post. Here are three terms that we may come across in conversations about how data in a database are organised and stored to make data exchange, retrieval, and use easier. Each one also helps answer the three questions we just discussed.

    Instructions: Click or tap the buttons to learn more about each term.

    Data set

    A collection of records organised for a particular purpose.


    Metadata

    Information or data about data content or structure. Describes how, when, and by whom a particular set of data was collected and how it is formatted.

    • Content metadata describes what is contained within the data. It’s useful for searching and controlling content.
    • Structure metadata tells us how the data containers are designed. It’s important for faster and more accurate database search and retrieval and for information stored in data warehouses.

    Data dictionary

    Provides definitions for and information about data, such as meaning, relationships to other data, origin, usage, and format.
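
    Here is a small Python sketch of how the three terms relate to each other. It is only an illustration: the field names, definitions, and records are invented, and real systems store this information in their own formats.

```python
# Data dictionary: definitions and formats for each data element
# (all entries are illustrative).
data_dictionary = {
    "patient_id": {"meaning": "Unique patient identifier", "format": str},
    "age_years": {"meaning": "Age at visit, in completed years", "format": int},
    "cd4_count": {"meaning": "CD4 cells per cubic millimetre of blood", "format": int},
}

# Metadata: data about the data set (how, when, and by whom it was collected).
metadata = {
    "collected_by": "Clinic data clerk",
    "collected_on": "2024-03-04",
    "fields": list(data_dictionary),
}

# Data set: a collection of records organised for a particular purpose.
data_set = [
    {"patient_id": "P-001", "age_years": 34, "cd4_count": 520},
    {"patient_id": "P-002", "age_years": "forty-two", "cd4_count": 310},
]


def format_problems(record):
    """Return the names of fields whose value does not match the dictionary format."""
    return [
        name
        for name, entry in data_dictionary.items()
        if not isinstance(record.get(name), entry["format"])
    ]
```

    Calling format_problems on the second record returns ["age_years"], because the data dictionary says that age should be stored as a number and not as words.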

  • Reading: Recording Data (5 min)

    The last activity of data capture is to record data. Recording refers to the process of ensuring that the data are available for later use. Common ways to record data include human memory, pencil and paper, and electronic entry.

    Instructions: Click or tap each button to read more about recording data.

    Human memory

    Information is preserved for later recall, typically through repetition.

    Pencil and paper

    Information is preserved for later reference by writing it on a piece of paper with a pencil.


    Electronic

    Information is entered by an individual through a user interface into a database for later retrieval.

  • Reading: Electronic Entry (10 min)

    Electronic entry of data is becoming widely used in the health sector as ministries of health and stakeholders invest in software. Electronic entry includes any process for converting data into an electronic format, such as health information system software. While facilities in Namibia may not use the same systems as facilities in Tanzania, each facility integrates electronic entry of data into its workflow in one of three ways: at time of care, at point of care, and after time of care.

    Instructions: Click or tap each of the 3 method buttons to learn more about it, then click or tap the scenario button to work through an example.

    Time of care

    Data is entered during the patient visit. For example, a patient arrives at clinic reception. The receptionist prints the patient summary or record, and a clinician writes notes on this paper record. After seeing the clinician, the patient (or clinician) takes the record to the data clerk, who updates the electronic record while that patient is still present to resolve issues and reduce errors.

    Point of care

    Data is entered by the individual providing the service. For example, the receptionist verifies the patient contact information, and the clinician enters the clinical information while talking to the patient.

    After time of care

    Data is entered post-visit, typically by a data clerk working through a stack of visit records. The facility maintains a paper system for routine use and an electronic system for reporting or other analysis.

    Dr. Kabandika works at a district hospital. When patients come for a scheduled visit, the receptionist has their file ready. During the visit, the nurse will record the patient’s vital signs on a paper form. When Dr. Kabandika sees the patient, he records his observations and diagnoses on the clinical visit form. Both of these filled forms are added to the patient’s file. At the end of day, files for all patients seen that day are transferred to the Health Records Office. Data clerks then enter the data from the patient’s records into their electronic medical record system.

    Question: Which term best describes when data is entered into the EMR system?

  • Reading: Verification (5 min)

    An important part of data capture is making sure data are of good quality. One way to do this is through data verification, which is the process of checking to make sure the data captured are correct. This process can be used at two points in time: during data collection and as data are recorded.

    With a paper-based system, we humans play a significant role in verifying that the data we capture are high quality. It can be difficult to check that data are correct and legible every time that we collect and record data. Going back and finding incorrect or missing data at a later time is even more challenging.

    With electronic information systems, the software can relieve us of some of the burden of verifying data at the time of collection and recording. For example, in an electronic laboratory information system, a data element can be programmed to accept only certain values, such as an age field that only accepts whole numbers below 100. If a user tried to enter 101 or spelled out forty-two instead of entering the numerals, the system would alert the user that the data entered may not be correct. The system performs this task every time data are entered.
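
    The age field check described above could look something like the following Python sketch. The function name and the alert messages are made up for this example, and real systems will implement the rule in their own way.

```python
def check_age(raw_value):
    """Return (is_valid, message) for an age typed into a form field."""
    text = raw_value.strip()
    # Reject anything that is not made up of numerals only,
    # such as "forty-two", "4.5", or an empty field.
    if not text.isdecimal():
        return False, "Age must be entered as a whole number using numerals."
    age = int(text)
    # The rule in this example: only whole numbers below 100 are accepted.
    if age >= 100:
        return False, "Age must be below 100. Please check the value."
    return True, "OK"
```

    With this rule, check_age("42") is accepted, while check_age("101") and check_age("forty-two") both produce an alert for the user.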

  • Video: Phases of Data Management – Store (5 min)

    Once we have collected, collated, and recorded our data, what happens to it? Let’s look at phase two: storing data. There are three activities related to storing data: saving, securing, and archiving.


    Instructions: Click or tap the video to play.

  • Video: Data Security (10 min)

    The next activity under storing is securing data. Patient health data are very personal and valuable so we need to consider security measures as part of data management.

    Instructions: Click or tap the video to play. After you watch the video, read a short scenario and respond to a question below the video.

    Dr. Katungire is the chief laboratory biologist at the Katangan National Public Health Laboratory. In mid-May, while at a national conference, he received a call alerting him that there had been a break-in at the laboratory. The main server was among the items stolen, along with the LIS database and all of the data from the last two years. The last backup was at the end of April.

    Question: How will this affect the health care workers and patients? Select all that apply.

  • Video: Archiving Data (3 min)

    The last activity under storing is archiving data.

    Instructions: Click or tap the video to play.

  • Video: Phases of Data Management – Retrieve (5 min)

    This brings us to the final phase of data management: retrieval. Our reason for retrieving data depends on how we plan to use the data stored in the database. For example, a clinician may need to look at specific data, such as a patient’s viral load or CD4 count, to make a treatment decision, whereas a data manager responsible for data quality will likely need to access the data to make corrections. Let’s look at the activities in this phase: viewing, analysing, correcting, exchanging, and reporting.
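
    As a rough illustration of how different roles retrieve the same stored data for different reasons, here is a short Python sketch that uses the built-in sqlite3 module. The table name, column names, and sample results are assumptions made for this example and are not taken from any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(
    "CREATE TABLE lab_results ("
    "patient_id TEXT, test_name TEXT, value INTEGER, result_date TEXT)"
)
conn.executemany(
    "INSERT INTO lab_results VALUES (?, ?, ?, ?)",
    [
        ("P-001", "CD4", 350, "2024-01-10"),
        ("P-001", "CD4", 520, "2024-04-12"),
        ("P-002", "CD4", 310, "2024-02-20"),
    ],
)


def latest_cd4(patient_id):
    """Viewing: the most recent CD4 count for one patient."""
    row = conn.execute(
        "SELECT value FROM lab_results "
        "WHERE patient_id = ? AND test_name = 'CD4' "
        "ORDER BY result_date DESC LIMIT 1",
        (patient_id,),
    ).fetchone()
    return row[0] if row else None


def correct_value(patient_id, result_date, new_value):
    """Correcting: a data manager fixes a wrongly entered value."""
    with conn:  # commits the change if no error occurs
        conn.execute(
            "UPDATE lab_results SET value = ? "
            "WHERE patient_id = ? AND result_date = ?",
            (new_value, patient_id, result_date),
        )


def results_per_patient():
    """Reporting: number of stored results for each patient."""
    return conn.execute(
        "SELECT patient_id, COUNT(*) FROM lab_results "
        "GROUP BY patient_id ORDER BY patient_id"
    ).fetchall()
```

    A clinician would use something like latest_cd4 to view one patient’s result, a data manager would use something like correct_value to fix an error, and a report would draw on a summary like results_per_patient.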


    Instructions: Click or tap the video to play.

  • Reading: Summary (5 min)

    We have covered a lot of information in this module. We’ve looked at the three phases of data management: capturing, storing, and retrieving the data we collect. In the capture phase, we can collect different kinds of data and group them by what information we need. We can then organise and record the data. We must also make sure to verify that the data are accurate. In the storing phase, we discussed using databases to save data. We also learned about security threats and how we can protect data. Archiving is the last activity in this phase. In the last phase, retrieval, we saw how healthcare workers retrieve information differently, based on their roles and what questions they are trying to answer. Now go on to the final quiz.

  • Quiz: Post-Test

    After you take the module, please be sure to take the post-test. Read each question and select the correct answer. Please note that the post-test will open in a new website page. You will need access to the Internet to complete the post-test.

    Click to go to the post-test: Post-test Module 2