Eryndlia's Prog Blog: Comma-Separated Values, final (demo_csvs-process_csv_value_array.adb)

Introduction
This final posting of Comma-Separated Values processing code is a how-to procedure, which, along with the main program Demo_CSVs described in the first part of this blog series, demonstrates that the code works as expected. It is not intended to be a full white-box-style test but just an example of how to use the program.

Process_CSV_Value_Array, procedure body
This procedure is an example of a user-defined subprogram that processes a single line of actual Values (not the header strings from the first CSV text file record), although, if these headers exist, they are used to index the array. The main program Demo_CSVs, together with the package CSV_Types and this procedure, defines not only the type of CSV file to be read but also the headers/titles, if any, that are expected in the first file record. In this case, the code has been set up to expect a CSV header record.

In normal use, procedure Demo_CSVs.Process_CSV_Value_Array would take the array of values passed to it and interpret the strings in terms of a specific domain. Example domains might be real-number map coordinates; a combination of dollar and text values from a financial spreadsheet; racing track bets; pantry contents; and so on ad infinitum. Here, Process_CSV_Value_Array merely writes each value in the Values array to the standard output file, along with the header for that value and some formatting characters.
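To make "interpret the strings in terms of a specific domain" concrete, here is a minimal sketch that is not part of the original program. It assumes a hypothetical Latitude type and uses the predefined 'Value attribute to turn a text value into a number, rejecting anything that does not parse or falls out of range.

```ada
with Ada.Text_IO;

procedure Interpret_Demo is
   --  Hypothetical domain type, chosen only for illustration.
   type Latitude is digits 8 range -90.0 .. 90.0;

   procedure Show (Text : String) is
      Degrees : Latitude;
   begin
      --  'Value parses the text; the assignment checks the range.
      --  Either step raises Constraint_Error on bad input.
      Degrees := Latitude'Value (Text);
      Ada.Text_IO.Put_Line (Text & " parsed as" & Latitude'Image (Degrees));
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line (Text & " is not a valid latitude");
   end Show;
begin
   Show ("47.6062");  --  well-formed and in range
   Show ("95.0");     --  well-formed but out of range
   Show ("n/a");      --  not a number at all
end Interpret_Demo;
```

A real callback would do the same kind of conversion for each column it cares about and leave the remaining columns as text.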

Let's look at the code:

procedure body Demo_CSVs.Process_CSV_Value_Array

The single parameter for this procedure is The_Text_Values, which is defined to be the same type as that of the array Values maintained in the protected type defined in each generic package.

As usual, we have with clauses to indicate the other library units (all packages in this case) that are needed for this subprogram to compile. Additionally, two use clauses help make the code shorter and more readable. One is for the package CSV_Types.Employee_CSV and makes everything declared there directly visible, so that references to items in that package do not have to be qualified. The other, placed within the loop, does the same for Ada.Strings.Fixed.

Only two objects are defined locally:

Input_Buffer
Length_Read

Both of these variables are used to help regulate the display of the values by forcing the user to press the Enter key after a Values array has been fully displayed.
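The pause can be sketched as follows, assuming it is done with Ada.Text_IO.Get_Line. The two object names come from the list above; the buffer length, the prompt text, and the procedure name Pause_Demo are assumptions made for this illustration.

```ada
with Ada.Text_IO;

procedure Pause_Demo is
   --  Names mirror the two local objects described above; the
   --  buffer length of 80 is an assumption.
   Input_Buffer : String (1 .. 80);
   Length_Read  : Natural;
begin
   Ada.Text_IO.Put_Line ("(the values for one record would appear here)");
   Ada.Text_IO.Put ("Press Enter to continue: ");
   --  Get_Line returns when the user ends the line. Anything typed
   --  before that lands in Input_Buffer (1 .. Length_Read) and is
   --  simply ignored.
   Ada.Text_IO.Get_Line (Input_Buffer, Length_Read);
end Pause_Demo;
```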

The first action the program takes is to display a line indicating that the values for a new record follow. Next comes the loop Display_The_Record_Values. This loop is indexed by the headers/titles defined in the program and whose String values were read from the first CSV record.

The declare block Build_Output_Line, which comes next, constructs a constant String object from the currently selected Header and the value in The_Text_Values indexed by that Header. Note that the order of the displayed values is not necessarily the order in which they occur within the CSV file record. The sequence selected by the loop is that of the originally defined Headers_Type, which, in this case, is the one within the package Employee_CSV. The String thus created is then output.
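Since the listing itself is not reproduced in this text, here is a rough, self-contained sketch of the shape just described. The loop and block names are the ones mentioned above; the three headers, the Values_Type array of Unbounded_String, the sample data, and the column width of 12 are all assumptions standing in for whatever CSV_Types.Employee_CSV actually declares.

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;
with Ada.Strings.Unbounded;

procedure Display_Demo is
   package SU renames Ada.Strings.Unbounded;

   --  Hypothetical stand-ins for declarations in CSV_Types.Employee_CSV.
   type Headers_Type is (Name, Department, Salary);
   type Values_Type is array (Headers_Type) of SU.Unbounded_String;

   --  Sample record; a real run would receive this as a parameter.
   The_Text_Values : constant Values_Type :=
     (Name       => SU.To_Unbounded_String ("Smith"),
      Department => SU.To_Unbounded_String ("Accounting"),
      Salary     => SU.To_Unbounded_String ("52000"));
begin
   Ada.Text_IO.Put_Line ("Values for the next record:");

   Display_The_Record_Values :
   for Header in Headers_Type loop
      Build_Output_Line :
      declare
         use Ada.Strings.Fixed;
         --  Head pads the header name to a fixed width so that the
         --  values line up in a column.
         Output_Line : constant String :=
           Head (Headers_Type'Image (Header), 12)
           & ": "
           & SU.To_String (The_Text_Values (Header));
      begin
         Ada.Text_IO.Put_Line (Output_Line);
      end Build_Output_Line;
   end loop Display_The_Record_Values;
end Display_Demo;
```

Because the loop ranges over Headers_Type, the lines come out in the declaration order of that type, regardless of how the columns were ordered in the file.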

When the loop exits after the last value in the array has been processed, the only thing left to do is to await the user's go-ahead (pressing Enter) before the next file record is read and processed.

Summary
In this series of four posts, we have seen the composition of a general-purpose processor for Comma-Separated Values (CSV) files. This program is designed:

To process a CSV file that may or may not have a header record. Although two generic packages were written for these two scenarios, it would not be terribly difficult to write a single package that handles both situations automatically.
So that any combination of text characters can be used as the "comma", that is, the break between values.
So that code may be reused through the use of generic packages.
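As an illustration of the second point, a multi-character "comma" can be located with Ada.Strings.Fixed.Index, which accepts a String pattern. This is only a sketch of the idea, not the parser from the earlier posts: the separator "::", the sample line, and the procedure name Split_Demo are made up, and quoted values are not handled.

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;

procedure Split_Demo is
   --  Hypothetical sample data; "::" stands in for the "comma".
   Separator : constant String := "::";
   Line      : constant String := "Smith::Accounting::52000";

   Start : Positive := Line'First;  --  first character of current value
   Found : Natural;                 --  position of next separator, or 0
begin
   loop
      --  Index raises Index_Error if From lies outside the string,
      --  which would happen when the line ends with a separator.
      exit when Start > Line'Last;
      Found := Ada.Strings.Fixed.Index (Line, Separator, From => Start);
      exit when Found = 0;
      Ada.Text_IO.Put_Line (Line (Start .. Found - 1));
      Start := Found + Separator'Length;
   end loop;
   --  Whatever remains after the last separator is the final value
   --  (an empty slice if the line ended with a separator).
   Ada.Text_IO.Put_Line (Line (Start .. Line'Last));
end Split_Demo;
```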

Along the way, we have used some interesting features of the Ada language including anonymous callback procedures; protected types, which by the way have some features of both a package and a task; generic units for parameterized reuse; exception handlers; nested subprograms and program blocks with declarative regions; a separately-compiled unit; some of the predefined, standard, library packages including Ada.Text_IO, Ada.Strings.Fixed, and Ada.Strings.Maps; and some new features being put into the Ada 2012 standard. The program shown could be optimized a fair amount in terms of simplicity — merging the two generic packages, for instance — and in terms of memory and CPU efficiency. I have my own ideas, and I would be curious to hear others'.

Please feel free to make suggestions or ask questions.

-- em

Saturday, October 15, 2011

Comma-Separated Values, final (demo_csvs-process_csv_value_array.adb)
