
Microform digitization service: Microfilms & Microfiches

Microfilm is a type of microphotographic film used in the archiving and data retention industry. It stores documents reduced 24 or 36 times (or even more) to produce micro-scale copies of the originals. It is part of the microform family, alongside microfiche. The main advantage of microfilm is its lifespan, which is estimated at more than 500 years. The technology took off in 1936, when the American Library Association endorsed microfilm, and it developed from there. A reel (film width of 16 mm, 35 mm or 105 mm; the length varies but is generally 30.5 m or 66 m) can contain up to 2,500 documents. This makes it easy to store tens of millions of archive pages in a small space. Microfilm nevertheless remains impractical for daily use (research, consultation, duplication, sharing, etc.).
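As a rough illustration of these figures, the short sketch below works out the on-film size of a page and the average length of film available per document. It assumes an A4 original (210 x 297 mm), which is our assumption and not a property of any particular reel; the function names are illustrative only.

```python
# Worked arithmetic for the reduction ratios and reel capacity quoted above.
# Assumption: the original document is an A4 sheet (210 mm x 297 mm).
A4_WIDTH_MM, A4_HEIGHT_MM = 210.0, 297.0


def reduced_size(width_mm, height_mm, ratio):
    """Return the (width, height) in mm of a page reduced `ratio` times."""
    return width_mm / ratio, height_mm / ratio


def film_per_document_mm(reel_length_m, documents):
    """Return the average length of film per document, in millimetres."""
    return reel_length_m * 1000 / documents


if __name__ == "__main__":
    for ratio in (24, 36):
        w, h = reduced_size(A4_WIDTH_MM, A4_HEIGHT_MM, ratio)
        print(f"{ratio}x reduction: {w:.2f} mm x {h:.2f} mm on film")

    # A 30.5 m reel holding 2,500 documents.
    per_doc = film_per_document_mm(30.5, 2500)
    print(f"About {per_doc:.1f} mm of film per document")
```

At a 24x reduction, an A4 page occupies roughly 9 x 12 mm on film, which is consistent with a 30.5 m reel holding about 2,500 documents.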

Microfiche uses the same technology as microfilm. It is generally available in sheets of 148 x 105 mm.

Digitization of Microfiches 


For data:

ALTO XML: ALTO (Analyzed Layout and Text Object) is an XML format used primarily in the digitization and transcription of textual documents, particularly in libraries and archives. It is standardized by the Library of Congress (LoC) and the German National Library to represent data extracted from digitized documents. It describes the layout and textual content of each page in a structured manner, which facilitates searching, manipulating, and automatically analyzing these documents. An ALTO XML file contains tags that describe the spatial arrangement of text on a page, including text blocks, lines, words, and characters, each with its position and size. It can also include metadata about the document and how it was processed. The format is widely used in library and archive digitization projects to store and exchange digitized text data in a standardized and interoperable way.
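To make the structure more concrete, here is a minimal sketch that reads the text lines and their positions from an ALTO file using Python's standard library. The sample document is hand-written for illustration (it is not output from a real scan), and it assumes the ALTO version 3 namespace; other versions use a different namespace URI, so the `NS` mapping would need adjusting.

```python
import xml.etree.ElementTree as ET

# Assumption: ALTO v3 namespace. Other ALTO versions use a different URI.
NS = {"alto": "http://www.loc.gov/standards/alto/ns-v3#"}

# Hand-written sample for illustration only.
SAMPLE_ALTO = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page ID="P1" WIDTH="2480" HEIGHT="3508">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine ID="TL1" HPOS="100" VPOS="200" WIDTH="600" HEIGHT="40">
            <String CONTENT="Parish" HPOS="100" VPOS="200" WIDTH="180" HEIGHT="40"/>
            <SP/>
            <String CONTENT="register" HPOS="300" VPOS="200" WIDTH="240" HEIGHT="40"/>
          </TextLine>
          <TextLine ID="TL2" HPOS="100" VPOS="260" WIDTH="300" HEIGHT="40">
            <String CONTENT="1902" HPOS="100" VPOS="260" WIDTH="120" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def extract_lines(alto_xml):
    """Return one dict per TextLine with its text and top-left position."""
    root = ET.fromstring(alto_xml)
    lines = []
    for line in root.iterfind(".//alto:TextLine", NS):
        # Each word is a String element; its text is in the CONTENT attribute.
        words = [s.get("CONTENT", "") for s in line.iterfind("alto:String", NS)]
        lines.append(
            {
                "text": " ".join(words),
                "hpos": float(line.get("HPOS", "0")),
                "vpos": float(line.get("VPOS", "0")),
            }
        )
    return lines


if __name__ == "__main__":
    for entry in extract_lines(SAMPLE_ALTO):
        print(entry)
```

Because every word carries its own coordinates, the same approach can be used to highlight search hits on the scanned image.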

CSV & XLSX Format: Comma-Separated Values (CSV) and Excel Open XML Spreadsheet (XLSX) are two commonly used file formats for storing tabular data, but they differ significantly:

File Structure:
CSV: A CSV file is a plain text file in which the data is organized as a table, with the values separated by commas (or other delimiters, such as semicolons or tabs). It does not support formatting, formulas, or multiple worksheets.
XLSX: An XLSX file is a ZIP archive of XML files and is the default workbook format of Microsoft Excel. It can contain multiple worksheets, formulas, charts, advanced cell formatting, images, and more. It preserves the structure of the data, as well as additional metadata such as cell formats. Macros are not stored in XLSX files; they require the macro-enabled XLSM variant.
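As a small illustration of the CSV structure described above, the sketch below reads the same table once with commas and once with semicolons, using Python's standard `csv` module. The sample rows and the `read_rows` helper are invented for this example.

```python
import csv
import io

# Invented sample data: the same table with two different delimiters.
# A value that contains the delimiter must be wrapped in double quotes.
COMMA_DATA = 'reel,frame,title\nR001,12,Birth register\nR002,3,"Deeds, volume 2"\n'
SEMICOLON_DATA = "reel;frame;title\nR001;12;Birth register\nR002;3;Deeds, volume 2\n"


def read_rows(text, delimiter=","):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


if __name__ == "__main__":
    print(read_rows(COMMA_DATA))
    print(read_rows(SEMICOLON_DATA, delimiter=";"))
```

Both calls yield the same rows. Note that every value comes back as text, since CSV carries no type or formatting information.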

 

Software Compatibility:


CSV: Being a universal plain text format, CSV files can be opened and read by a wide variety of software, including simple text editors and spreadsheet applications.
XLSX: XLSX files were designed for Microsoft Excel and require software that understands the format, such as Microsoft Excel, LibreOffice Calc, or Google Sheets.

 

File Size:


CSV: CSV files tend to be more compact than XLSX files because they are stored as plain text and typically do not contain additional formatting information.
XLSX: XLSX files can be larger than CSV files because they carry formatting information and can include advanced features such as formulas, charts, and images, although their built-in ZIP compression partly offsets this.

Data Processing:


CSV: CSV files are easier to process and manipulate in programming because they are based on plain text and do not require specialized software to read them.
XLSX: XLSX files can be more complex to manipulate in programming because of their zipped XML structure and the need to use specific libraries or APIs to read and write data.
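The following sketch shows how little code is needed to process CSV data with Python's standard library alone. The inventory columns (`reel`, `frame`, `doc_type`) and the `count_by_type` helper are hypothetical and only serve the example. Reading the same data from an XLSX file would typically require a third-party library such as openpyxl.

```python
import csv
import io
from collections import Counter

# Hypothetical inventory of digitized frames, for illustration only.
INVENTORY_CSV = """reel,frame,doc_type
R001,1,birth
R001,2,birth
R002,1,marriage
"""


def count_by_type(csv_text):
    """Count how many rows exist for each value of the doc_type column."""
    reader = csv.DictReader(io.StringIO(csv_text))  # first row = column names
    return Counter(row["doc_type"] for row in reader)


if __name__ == "__main__":
    for doc_type, total in count_by_type(INVENTORY_CSV).items():
        print(f"{doc_type}: {total}")
```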

XML Format: XML, or eXtensible Markup Language, is a markup language used to store and exchange data in a human- and machine-readable manner. It was designed to be extensible and adaptable to a wide variety of data representation needs. Here are some key features of XML:

Data Structuring: XML allows data to be structured using user-defined tags. Tags are used to mark data elements and attributes, which helps organize data hierarchically.

Human Readability: XML documents are typically written in a human-readable format, making it easy to understand and manually modify the data without the need for specialized tools.

Extensibility: XML is extensible, meaning that users can define their own tags and data structures to meet their specific needs. This makes it suitable for a wide variety of applications and use cases.

Interoperability: XML is widely used in computer systems to exchange data between different applications and platforms. Its simple structure and text format make it easily readable and interpretable by computer systems, which promotes data interoperability.


Metadata Support: XML is often used to store metadata associated with documents or data elements. Tags can be used to describe information such as title, author, date, etc.

Processing with Specific Languages: XML is often used in conjunction with companion languages such as XPath, XSLT, and XML Schema, which provide functionality to search, transform, and validate XML documents, respectively.
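The features above can be sketched in a few lines of Python. This example builds a small document with user-defined tags and metadata, serializes it to text, and queries it with the limited XPath subset supported by the standard library's `xml.etree.ElementTree`. The tag names (`catalog`, `record`, `title`, `year`) and the helper functions are invented for illustration. XSLT transformation and XML Schema validation are not covered by the standard library and usually rely on a third-party package such as lxml.

```python
import xml.etree.ElementTree as ET

# Invented records: (reel identifier, title, year).
RECORDS = [
    ("R001", "Birth register", "1902"),
    ("R002", "Marriage register", "1910"),
]


def build_catalog(records):
    """Build an XML tree with user-defined tags and an id attribute."""
    catalog = ET.Element("catalog")
    for reel_id, title, year in records:
        record = ET.SubElement(catalog, "record", id=reel_id)
        ET.SubElement(record, "title").text = title
        ET.SubElement(record, "year").text = year
    return catalog


def title_for(root, reel_id):
    """Look up a title by record id using an XPath-style predicate."""
    node = root.find(f"./record[@id='{reel_id}']/title")
    return node.text if node is not None else None


if __name__ == "__main__":
    xml_text = ET.tostring(build_catalog(RECORDS), encoding="unicode")
    print(xml_text)  # plain text, readable by people and programs

    root = ET.fromstring(xml_text)  # parse it back, as another system would
    print(title_for(root, "R002"))
    print([t.text for t in root.findall("./record/title")])
```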
