Litigation Support Technical Standards
by Mark Lieb





    3.12 OCR

    Vendor should use auto-rotate and voting when generating OCR. Most OCR software offers an auto-rotate option. When auto-rotate is enabled, the software OCRs each image four times, rotating it 90 degrees between passes, then determines the best result and publishes that text to the load file. The majority of documents share the same orientation, portrait, and yield good results even without auto-rotate. Other documents, such as an HR chart, may be designed for a landscape layout. Still others may have been scanned “upside-down”, which produces garbage OCR unless the image is rotated. OCR voting is a process in which the results of multiple OCR programs are compared to determine the best result.
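
    The two ideas above can be sketched in a few lines of Python. This is only an illustration, not how any particular OCR product works: the `recognize` callable, its confidence score, and the word-by-word voting rule are all assumptions made for the example. Commercial engines typically align and vote at a much finer level.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

ROTATIONS = (0, 90, 180, 270)  # the four passes described in the text


def auto_rotate_ocr(recognize: Callable[[int], Tuple[str, float]]) -> Tuple[int, str]:
    """Try each rotation and keep the text with the highest confidence.

    `recognize` is a hypothetical stand-in for a real OCR call: it takes a
    rotation in degrees and returns (text, confidence).
    """
    best_rotation, best_text, best_conf = 0, "", float("-inf")
    for rotation in ROTATIONS:
        text, conf = recognize(rotation)
        if conf > best_conf:
            best_rotation, best_text, best_conf = rotation, text, conf
    return best_rotation, best_text


def vote(candidates: Sequence[str]) -> str:
    """Word-level majority vote across the outputs of several OCR engines.

    Simplifying assumption: every engine returned the same number of words,
    so words can be compared position by position.
    """
    token_lists = [candidate.split() for candidate in candidates]
    voted = [Counter(words).most_common(1)[0][0] for words in zip(*token_lists)]
    return " ".join(voted)


if __name__ == "__main__":
    # Fake engine: the page was scanned upside-down, so 180 degrees reads best.
    fake = {0: ("~~##", 0.10), 90: ("| | |", 0.05), 180: ("Dear Sir", 0.95), 270: ("- - -", 0.05)}
    print(auto_rotate_ocr(lambda rotation: fake[rotation]))
    print(vote(["the quick brown", "the qu1ck brown", "the quick brovvn"]))
```

    In the voting example, each engine makes a different mistake, so the majority reading at every word position recovers the correct text.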

     

     

    Quality Check

    The OCR text should approximate, as closely as possible, the formatting found on the original image. The OCR field should never be just the words run together in one long string.
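
    A rough automated check for the "one long string" problem might look like the following. The function name and the 200-character threshold are assumptions chosen for illustration; a flagged file still needs a human to compare it against the image.

```python
def looks_flattened(ocr_text: str, max_line_length: int = 200) -> bool:
    """Return True when any line of OCR text is suspiciously long.

    A very long line suggests the original line breaks were discarded and
    the words were written out as one continuous string.
    """
    lines = [line for line in ocr_text.splitlines() if line.strip()]
    if not lines:
        return False  # empty OCR is a separate problem, not a formatting one
    return max(len(line) for line in lines) > max_line_length


if __name__ == "__main__":
    print(looks_flattened("word " * 100))                # one 500-character line
    print(looks_flattened("Dear Sir,\n\nThank you.\n"))  # line breaks preserved
```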

     

    No text at the top, bottom, or either side should be clipped.

     

    Multi-Page Text Files

    There should be one OCR text file per document. The OCR filename must match the document image key. For example, a 10-page document with the image key AA001 should have a corresponding file, AA001.TXT, that contains the OCR for AA001 through AA010.

     

    Each page of OCR should have a line identifying the page number or Bates number. That way, people can search for any Bates number and find the correct document. Please include space between the OCR text and the page marker.

     

    The following shows sample OCR:

     

    << AA001 >>

     

    Text for first page

     

    << AA002 >>

     

    Text for second page
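
    As a sketch of how such a file could be assembled and later searched, the Python below writes one marker line per page in the `<< BATES >>` style of the sample and splits the file back into pages. The function names are hypothetical, and the blank line around each marker is an assumption based on the request for space between the text and the page marker.

```python
import re
from typing import Dict, List, Tuple

# A marker line in the style of the sample, e.g. "<< AA001 >>"
PAGE_MARKER = re.compile(r"^<< (\S+) >>[ \t]*$", re.MULTILINE)


def build_document_ocr(pages: List[Tuple[str, str]]) -> str:
    """Join per-page OCR into one document-level text, one marker per page.

    `pages` is a list of (bates_number, page_text) in page order.
    """
    parts = [f"<< {bates} >>\n\n{text.strip()}\n" for bates, text in pages]
    return "\n".join(parts)


def ocr_filename(pages: List[Tuple[str, str]]) -> str:
    """The OCR filename matches the image key, i.e. the first page's number."""
    return pages[0][0] + ".TXT"


def split_pages(document_text: str) -> Dict[str, str]:
    """Map each Bates number found in a document-level OCR file to its text."""
    pieces = PAGE_MARKER.split(document_text)
    # pieces = [text before first marker, bates1, body1, bates2, body2, ...]
    return {pieces[i]: pieces[i + 1].strip() for i in range(1, len(pieces) - 1, 2)}


if __name__ == "__main__":
    pages = [("AA001", "Text for first page"), ("AA002", "Text for second page")]
    text = build_document_ocr(pages)
    print(ocr_filename(pages))
    print(text)
    print(split_pages(text)["AA002"])
```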

     

    The following chart shows a sample database and corresponding OCR files:

     

    IMAGE KEY     BEGBATES      ENDBATES    PATH                     FILENAME
    AA001         AA001         AA010       D:\[VOLUME NAME]\OCR\    AA001.TXT
    AA011         AA011         AA011       D:\[VOLUME NAME]\OCR\    AA011.TXT
    AA012         AA012         AA038       D:\[VOLUME NAME]\OCR\    AA012.TXT
    AA039.0001*   AA039.0001    AA0100      D:\[VOLUME NAME]\OCR\    AA039.0001.TXT

    * Please refer to Bates prefix and suffix conventions.
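
    A delivery can be checked against this convention before it is loaded. The sketch below is a hypothetical validator, not part of any load-file software: it assumes each database row is available as a dictionary keyed by the column names in the chart, and it treats BEGBATES matching the image key as a rule because every sample row follows it.

```python
from typing import Dict, List


def check_load_file_row(row: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one database row (empty if none)."""
    problems = []
    key = row["IMAGE KEY"]
    # Assumption drawn from the sample chart: BEGBATES equals the image key.
    if row["BEGBATES"] != key:
        problems.append(f"{key}: BEGBATES {row['BEGBATES']} does not match image key")
    # The OCR filename must match the document image key.
    expected = key + ".TXT"
    if row["FILENAME"].upper() != expected.upper():
        problems.append(f"{key}: expected OCR file {expected}, found {row['FILENAME']}")
    return problems


if __name__ == "__main__":
    good = {"IMAGE KEY": "AA001", "BEGBATES": "AA001", "ENDBATES": "AA010", "FILENAME": "AA001.TXT"}
    bad = {"IMAGE KEY": "AA011", "BEGBATES": "AA011", "ENDBATES": "AA011", "FILENAME": "AA012.TXT"}
    for row in (good, bad):
        print(row["IMAGE KEY"], check_load_file_row(row))
```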

     



    (C)2005 Ad Litem Consulting, Inc.









    About Litigation Support Technical Standards

    This document was initially designed to eliminate any discrepancy between a firm's technical needs and how vendors build the technical aspects of their products. Without a shared standard, litigation support spends needless hours changing the vendor delivery, and the firm pays for a product that litigation support then has to modify. Today, the document covers as many technical requirements as possible for as many types of discovery and software as possible.

    To understand the reason for these explicit directions, please see the final section of this document, “Things not to do”. All of the examples there are from real life, and all of them caused headaches, delaying reviews, productions, and more.

    I hope that this document is helpful to you.
























