Sunday, March 4, 2012

Reading Text Files

A very common need when programming almost any application is the ability to read data from a text file. Fortunately VB.NET excels at this sort of thing. In the tutorial that follows we will show exactly how you can read through a text file, parse the information, and use it in your Visual Basic application. This can be combined with other articles on this site to handle storing and reading data.

A data file consists of records, which consist of fields. The file that will be used for all examples in this section is a simplified employee file, which consists of the following fields:

      Field                              Data Type*
      Employee name (last name first)    String
      Department number                  Integer
      Job title                          String
      Hire date                          Date
      Hourly rate                        Single

*Please note that the data types for these fields are the data types of the variables into which these fields will be stored.  In the text file, all fields will be represented as a string of characters.

Suppose there were five records in the file.  A graphic representation of the file populated with the five data records follows (the field names are not stored in the file):

[Table of the five employee records, beginning with the Employee Name (Last Name first) column, not reproduced here.]

There are two basic ways a file like this would typically come to you. The first way is in fixed-length format, where each field is stored in a fixed position.  On each record, a particular field starts and ends in the same position and occupies the same amount of space. The second way is in delimited format, where the fields within each record are separated from each other by a delimiter character. Commonly used delimiter characters are the tab (ASCII value 9), the "pipe" (vertical bar character "|"), and the comma (","). The comma-delimited file is probably the most commonly used (it is also referred to as a "csv" file, where "csv" stands for "comma separated values"). However, there are certain considerations for comma-delimited files that must be taken into account; fortunately VB has an easy way to address these considerations.

The basic way to process both fixed-length files and delimited files is to read the file one line (record) at a time into a string variable, and then use string-handling methods to "break up" the data in the record into separate variables, where each variable represents a field of the record. However, in the case of delimited files, care must be taken to ensure that the data itself does not contain the delimiter character (because if it does, and your program does not handle this, you will wind up with misaligned fields and end-of-file errors). If the delimiter is the tab (ASCII value 9) or another rarely used character such as the pipe ("|"), chances are good that the data fields do not contain it. In a comma-delimited file, however, some data fields may very well contain a comma, in which case the field will be enclosed in quotes. To handle a comma-delimited file that contains fields enclosed in quotes using only the System.IO methods, you would have to do extra string parsing to process the data correctly. However, there is a function in the Microsoft.VisualBasic namespace that handles this parsing automatically for you, making it relatively easy to process such a file.

We will first look at examples that will process fixed-length files and delimited files using both the System.IO namespace techniques and the Microsoft.VisualBasic techniques. We will then look at an example that can easily handle a challenge brought about by comma-delimited files which contain fields enclosed in quotes.

In this example, for each record (line) of the file, each field starts and ends in the same position and occupies the same amount of space. The content of the file is shown below, along with a column position guide (highlighted in yellow) shown above the records.  From the example, it should be clear that the employee name occupies positions 1 through 20 of each record (note that names shorter than 20 characters are padded with blank spaces); the department number occupies positions 21 through 24 (right-justified and padded left with spaces – so while up to four digits are allowed, each record in this case uses only three digits); the job title occupies positions 30 through 50; the hire date occupies positions 51 through 60; and the hourly rate occupies positions 61 through 65.

[Image: contents of employee_fixed.txt with a column position guide above the records]
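The column positions above are 1-based, while the Substring method counts from 0. The following is a minimal sketch of that translation; the FieldAt helper and the sample record are made up for illustration and are not part of the original project (in particular, the date format used in the real file is an assumption).

```vbnet
Module FixedWidthSketch

    ' Hypothetical helper: returns the field that occupies the 1-based
    ' positions startPos through endPos (inclusive) of a record.
    Function FieldAt(ByVal record As String, ByVal startPos As Integer, _
                     ByVal endPos As Integer) As String
        Return record.Substring(startPos - 1, endPos - startPos + 1)
    End Function

    Sub Main()
        ' Made-up 65-character record laid out per the positions described above
        ' (positions 25 through 29 are unused filler).
        Dim strEmpRecord As String = "ANDERSON,ANDY".PadRight(20) _
                                   & "100".PadLeft(4) _
                                   & Space(5) _
                                   & "PROGRAMMER".PadRight(21) _
                                   & "03/04/1997" _
                                   & "25.00"

        Console.WriteLine("[" & FieldAt(strEmpRecord, 1, 20) & "]")    ' employee name
        Console.WriteLine("[" & FieldAt(strEmpRecord, 21, 24) & "]")   ' department number
        Console.WriteLine("[" & FieldAt(strEmpRecord, 30, 50) & "]")   ' job title
        Console.WriteLine("[" & FieldAt(strEmpRecord, 51, 60) & "]")   ' hire date
        Console.WriteLine("[" & FieldAt(strEmpRecord, 61, 65) & "]")   ' hourly rate
    End Sub

End Module
```

Positions 1 through 20 therefore become Substring(0, 20), positions 21 through 24 become Substring(20, 4), and so on.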

First, let us introduce the new VB.NET elements that will be used in this example.

The input file ("employee_fixed.txt") will be located in the \bin\Debug subdirectory of your application directory. This is the directory where the executable version of your program is created and stored when you test or run your program from the development environment. This subdirectory can be referenced as follows:

        My.Application.Info.DirectoryPath

To refer to the input file with the full directory path, you concatenate a backslash ("\") along with the filename to the above reference:

        My.Application.Info.DirectoryPath & "\employee_fixed.txt"

(Note: "My.Application.Info.DirectoryPath" would be the equivalent of "App.Path" in pre-.NET versions of VB.) 
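As an alternative to concatenating the backslash by hand, the path can be built with System.IO.Path.Combine, which inserts the separator only when it is needed. This is a small sketch rather than the approach used in the sample projects, and it assumes the My namespace is available as it is in the project type used here.

```vbnet
Imports System.IO

Module PathSketch

    Sub Main()
        ' Path.Combine supplies the directory separator between the two parts.
        Dim strFileName As String = _
            Path.Combine(My.Application.Info.DirectoryPath, "employee_fixed.txt")

        Console.WriteLine(strFileName)

        ' File.Exists lets you check for the input file before trying to open it.
        Console.WriteLine(File.Exists(strFileName))
    End Sub

End Module
```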

The FileStream class is used to create an object that will reference the file that will be processed by the program. The syntax for declaring a FileStream object, as it will be used in the examples that follow, is:

       Dim variable As New FileStream(path, mode, access [,share])

where

path is a string that refers to the full path and filename of the file to be processed. As in the example above, you can use a reference like My.Application.Info.DirectoryPath & "\somefile.txt" or any other path/file reference, such as "C:\SomeDirectory\SomeFile.dat".

mode is an enumeration (FileMode) that specifies how the file should be opened or created. The possible values are:

      FileMode.Append: Opens the file if it exists and starts writing new data at the end of the file (preserving what is already there). If the file does not exist, it is created.

      FileMode.Create: Creates a new file; if the file already exists, it is overwritten.

      FileMode.CreateNew: Creates a new file; if the file already exists, an exception is thrown.

      FileMode.Open: Opens an existing file; if the file does not exist, an exception is thrown.

      FileMode.OpenOrCreate: Opens a file if it exists, or creates a new file if it does not exist.

      FileMode.Truncate: Opens an existing file and truncates it (i.e., deletes any data that was previously there).

access is an enumeration (FileAccess) that specifies how the file can be accessed. The possible values are:

      FileAccess.Read: Data can be read from the file, but not written to it.

      FileAccess.ReadWrite: Data can be read from and written to the file.

      FileAccess.Write: Data can be written to the file, but not read from it.

share is an optional enumeration (FileShare) that specifies restrictions on how other processes can access the file. The possible values are:

      FileShare.None: Other processes may neither read from nor write to the file.

      FileShare.ReadWrite: Any process can read from or write to the file.

      FileShare.Write: Other processes may write to the file.

      FileShare.Read: Other processes may read from the file.

Sample declaration:

      Dim strFileName As String = My.Application.Info.DirectoryPath & "\employee_fixed.txt"
      Dim objFS As New FileStream(strFileName, FileMode.Open, FileAccess.Read)

In the above declaration, the filename was established in a separate string variable, which was then used as the path argument to the FileStream class. The resulting variable, objFS, is a FileStream object that refers to the file.

If desired, you could skip the separate declaration for the filename and write the declaration like this:

      Dim objFS As New FileStream(My.Application.Info.DirectoryPath & "\employee_fixed.txt", FileMode.Open, FileAccess.Read)

The StreamReader class is used to read a stream of characters. In the .NET framework, a stream, in general terms, is the flow of data from one location to another. In the case of these examples, a stream refers to the text file that will be processed by the sample programs. The syntax for declaring a StreamReader object is:

       Dim variable As New StreamReader(stream)

where

stream is an object representing the stream of characters to be read (in this case a text file).

Sample declaration:

Assuming the declaration of objFS as described above, the declaration could be written as follows: 

      Dim objSR As New StreamReader(objFS)

If desired, you could skip the separate declarations for the filename and FileStream and write the declaration like this:

      Dim objSR As New StreamReader( _
          New FileStream(My.Application.Info.DirectoryPath & "\employee_fixed.txt", _
                         FileMode.Open, FileAccess.Read))

Commonly used methods of the StreamReader class are:

      Peek: Looks ahead to the next available character in the input stream without actually advancing to that next position. If no character is available, Peek returns -1. (This method is convenient to test for end-of-file.)

      Read: Reads the next character from the input stream.

      ReadLine: Reads the next line of characters from the input stream into a string.

      ReadToEnd: Reads the data from the current position in the input stream to the end of the stream into a string. (This method is often used to read the entire contents of a file.)

      Close: Closes the StreamReader and its associated FileStream object.
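If an exception occurs before Close is reached, the file can stay open. One common way to guard against that is a Using block, which disposes the reader (and the stream it wraps) when the block is exited. The sketch below is not taken from the sample projects; it writes a throwaway temporary file, whose contents are made up, so that it can run on its own.

```vbnet
Imports System.IO

Module UsingSketch

    Sub Main()
        ' Create a small throwaway file so the sketch is self-contained.
        Dim strTempFile As String = Path.GetTempFileName()
        File.WriteAllLines(strTempFile, New String() {"first line", "second line"})

        ' The reader is disposed at End Using, even if an exception is thrown
        ' inside the block, so no explicit Close call is needed.
        Using objSR As New StreamReader( _
                New FileStream(strTempFile, FileMode.Open, FileAccess.Read))
            Do While objSR.Peek() <> -1
                Console.WriteLine(objSR.ReadLine())
            Loop
        End Using

        File.Delete(strTempFile)
    End Sub

End Module
```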

Code for sample program #1:

The following code reads the fixed-length format file as described above, record by record, and writes a formatted line containing the data from each record on the console.

      Imports System.IO

      Module Module1

          Sub Main()

              Dim strFileName As String = My.Application.Info.DirectoryPath _
                                        & "\employee_fixed.txt"
              Dim objFS As New FileStream(strFileName, FileMode.Open, FileAccess.Read)
              Dim objSR As New StreamReader(objFS)

              Dim strEmpRecord As String
              Dim strEmpName As String
              Dim intDeptNbr As Integer
              Dim strJobTitle As String
              Dim dtmHireDate As Date
              Dim sngHrlyRate As Single

              Console.WriteLine("The records in the employee_fixed.txt file are:")
              Console.WriteLine("")
              Console.WriteLine("EMPLOYEE NAME".PadRight(20) _
                              & Space(3) _
                              & "DEPT" _
                              & Space(3) _
                              & "JOB TITLE".PadRight(21) _
                              & Space(3) _
                              & "HIRE DATE " _
                              & Space(3) _
                              & "HRLY RATE")
              Console.WriteLine("-------------".PadRight(20) _
                              & Space(3) _
                              & "----" _
                              & Space(3) _
                              & "---------".PadRight(21) _
                              & Space(3) _
                              & "--------- " _
                              & Space(3) _
                              & "---------")

              Do While objSR.Peek <> -1
                  ' read the current record (line) into the variable strEmpRecord
                  strEmpRecord = objSR.ReadLine

                  ' break up the record into separate variables
                  strEmpName = strEmpRecord.Substring(0, 20)
                  intDeptNbr = CInt(strEmpRecord.Substring(20, 4))
                  strJobTitle = strEmpRecord.Substring(29, 21)
                  dtmHireDate = CDate(strEmpRecord.Substring(50, 10))
                  sngHrlyRate = CSng(strEmpRecord.Substring(60, 5))

                  ' Output the data to the console as a formatted line ...
                  Console.WriteLine(strEmpName _
                                  & Space(3) _
                                  & intDeptNbr.ToString.PadLeft(4) _
                                  & Space(3) _
                                  & strJobTitle _
                                  & Space(3) _
                                  & Format(dtmHireDate, "MM/dd/yyyy") _
                                  & Space(3) _
                                  & Format(sngHrlyRate, "Currency").PadLeft(9))
              Loop

              objSR.Close()

              Console.WriteLine("")
              Console.WriteLine("Press Enter to close this window.")
              Console.ReadLine()

          End Sub

      End Module

When the above code is run, the following output is produced:

[Image: console output of sample program #1]

A synopsis of the code follows:

Note that first, your program must import the System.IO namespace. In the Main procedure, the FileStream and StreamReader objects are established, and variables to hold the record data and its constituent fields are declared. Several lines that serve as "headings" are then written to the console. The program then commences with the main processing loop that will process each record in the file, line by line. As long as there is data remaining in the file (as tested with the Peek method), a record is read from the file using the ReadLine method into the strEmpRecord variable. The data from the record is broken up into fields stored in separate variables using the Substring method. A formatted string using these field variables is then written to the console. When the main processing loop ends (i.e., the Peek method detects end-of-file), the StreamReader is closed, and the program ends with our standard message to close the console window.
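One thing the program above does not guard against is a record shorter than 65 characters (a blank trailing line, for example): Substring throws an ArgumentOutOfRangeException when the requested range runs past the end of the string. A minimal sketch of a length check follows; the RECORD_LENGTH constant and the two sample records are made up for illustration.

```vbnet
Module ShortRecordSketch

    ' The last field (hourly rate) ends at position 65.
    Const RECORD_LENGTH As Integer = 65

    Sub Main()
        ' Two made-up records: one complete, one cut off after the department number.
        Dim astrRecords() As String = { _
            "ANDERSON,ANDY".PadRight(20) & " 100" & Space(5) _
                & "PROGRAMMER".PadRight(21) & "03/04/1997" & "25.00", _
            "BABCOCK,BILLY".PadRight(20) & " 110"}

        For Each strEmpRecord As String In astrRecords
            If strEmpRecord.Length < RECORD_LENGTH Then
                ' Report and skip the record instead of letting Substring throw.
                Console.WriteLine("Skipping short record (" _
                    & strEmpRecord.Length.ToString() & " characters)")
                Continue For
            End If

            Console.WriteLine("Job title: [" & strEmpRecord.Substring(29, 21) & "]")
        Next
    End Sub

End Module
```

Whether a short record should be skipped, padded, or treated as a fatal error depends on the application; skipping is only one option.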

Download the VB.NET project code for the example above: Read a Fixed Text File

In this example, the fields within each record (line) of the file are delimited by the pipe (|) character. The content of the file is shown below. Note that the fields are "trimmed", as the extra padding needed by the fixed-length format is not needed here.

[Image: contents of the pipe-delimited employee file]

As you will see in the code below, the basic process is the same, except instead of using the Substring method to break up the fields as we had to do to process the fixed-length format file, we will use the Split method to break up the fields for this pipe-delimited file.

Code for sample program #2:

The following code reads the pipe-delimited file as described above, record by record, and writes a formatted line containing the data from each record on the console. (This program produces the exact same output as the previous program.)

      Imports System.IO

      Module Module1

          Sub Main()

              Dim strFileName As String = My.Application.Info.DirectoryPath _
                                        & "\employee_pipe.txt"
              Dim objFS As New FileStream(strFileName, FileMode.Open, FileAccess.Read)
              Dim objSR As New StreamReader(objFS)

              Dim strEmpRecord As String
              Dim astrEmpFields() As String

              Console.WriteLine("The records in the employee_pipe.txt file are:")
              Console.WriteLine("")
              Console.WriteLine("EMPLOYEE NAME".PadRight(20) _
                              & Space(3) _
                              & "DEPT" _
                              & Space(3) _
                              & "JOB TITLE".PadRight(21) _
                              & Space(3) _
                              & "HIRE DATE " _
                              & Space(3) _
                              & "HRLY RATE")
              Console.WriteLine("-------------".PadRight(20) _
                              & Space(3) _
                              & "----" _
                              & Space(3) _
                              & "---------".PadRight(21) _
                              & Space(3) _
                              & "--------- " _
                              & Space(3) _
                              & "---------")

              Do While objSR.Peek <> -1
                  ' read the current record (line) into the variable strEmpRecord
                  strEmpRecord = objSR.ReadLine

                  ' break up the record into separate elements of the astrEmpFields array
                  ' ("|"c is a Char literal, which is what Split expects)
                  astrEmpFields = strEmpRecord.Split("|"c)

                  ' Output the data to the console as a formatted line ...
                  Console.WriteLine(astrEmpFields(0).PadRight(20) _
                                  & Space(3) _
                                  & astrEmpFields(1).PadLeft(4) _
                                  & Space(3) _
                                  & astrEmpFields(2).PadRight(21) _
                                  & Space(3) _
                                  & Format(CDate(astrEmpFields(3)), "MM/dd/yyyy").PadRight(10) _
                                  & Space(3) _
                                  & Format(CSng(astrEmpFields(4)), "Currency").PadLeft(9))
              Loop

              objSR.Close()

              Console.WriteLine("")
              Console.WriteLine("Press Enter to close this window.")
              Console.ReadLine()

          End Sub

      End Module

A synopsis of the code follows:

As in the previous program, your program must import the System.IO namespace. In the Main procedure, the FileStream and StreamReader objects are established, and variables to hold the record data and a string array that will contain its constituent fields are declared. Several lines that serve as "headings" are then written to the console. The program then commences with the main processing loop that will process each record in the file, line by line. As long as there is data remaining in the file (as tested with the Peek method), a record is read from the file using the ReadLine method into the strEmpRecord variable. The data from the record is broken up into fields using the Split method, which will cause the data from the fields to be stored in elements of the astrEmpFields array. A formatted string using the array data is then written to the console. When the main processing loop ends (i.e., the Peek method detects end-of-file), the StreamReader is closed, and the program ends with our standard message to close the console window.
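CDate and CSng interpret their input according to the regional settings of the machine running the program, so a date such as 3/4/1997 could be read as March 4 or April 3 depending on the locale. If the file format is fixed, one option is to parse with an explicit format and the invariant culture. The sketch below uses a made-up record and assumes the file stores dates as month/day/year; neither is confirmed by the sample file.

```vbnet
Imports System.Globalization

Module CultureSketch

    Sub Main()
        ' Made-up pipe-delimited record; the real file's date format may differ.
        Dim strEmpRecord As String = "ANDERSON,ANDY|100|PROGRAMMER|3/4/1997|25"
        Dim astrEmpFields() As String = strEmpRecord.Split("|"c)

        ' Parse with an explicit pattern so the result does not depend on
        ' the regional settings of the machine.
        Dim dtmHireDate As Date = Date.ParseExact(astrEmpFields(3), "M/d/yyyy", _
                                                  CultureInfo.InvariantCulture)
        Dim sngHrlyRate As Single = Single.Parse(astrEmpFields(4), _
                                                 CultureInfo.InvariantCulture)

        ' Prints 1997-03-04
        Console.WriteLine(dtmHireDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture))
        ' Prints 25.00
        Console.WriteLine(sngHrlyRate.ToString("F2", CultureInfo.InvariantCulture))
    End Sub

End Module
```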

Download the VB.NET sample code for this example: Reading a pipe delimited file

In this example, the fields within each record (line) of the file are delimited by a comma (,). However, since the employee name field itself contains a comma, it is also enclosed in quotes. The content of the file is shown below.

[Image: contents of the comma-delimited employee file]

The intention of the quotes is to tell the program that is to process this file to treat the data between the quotes as a single field, thus overriding the function of the comma in that case. If we had only the methods described thus far available to us, we would have to perform extra string parsing on the record after it had been read in with the ReadLine method of the StreamReader object – we could not simply use the Split method using the comma as the delimiter. Doing that would cause us to get an extra data item that we did not expect (i.e., two fields for the employee name instead of one), and furthermore, the quotes would be stored as part of the fields, which we also would not want. Fortunately, we can use the Input function available in the Microsoft.VisualBasic namespace to handle exactly this type of situation, as that is what it was designed to do. The MS.VB Input function is a retooled version of the Input statement that was available not only in classic VB, but prior to that in old versions of BASIC such as QBASIC and GW-BASIC.
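To see the problem concretely, the following sketch applies Split to a record with a quoted name field. The record text is an assumption based on the field values quoted later in this article; the exact layout of the sample file may differ.

```vbnet
Module SplitPitfallSketch

    Sub Main()
        ' Assumed form of the first record: the name is quoted because it contains a comma.
        Dim strEmpRecord As String = """ANDERSON,ANDY"",100,PROGRAMMER,3/4/1997,25"

        Dim astrEmpFields() As String = strEmpRecord.Split(","c)

        ' Prints 6 rather than the expected 5, because the comma inside the
        ' quotes is treated as a delimiter too.
        Console.WriteLine(astrEmpFields.Length)

        ' The first two elements are "ANDERSON and ANDY", quotes included.
        For Each strField As String In astrEmpFields
            Console.WriteLine("[" & strField & "]")
        Next
    End Sub

End Module
```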

In this example, we will also use some of the other Microsoft.VisualBasic namespace functions. The MS.VB namespace functions are summarized below, followed by more detailed explanations.

      FreeFile: Returns an Integer value representing the next file number available for use by the FileOpen function. (Analogous to the FreeFile statement in classic VB.)  (Note: When using the MS.VB namespace functions, a file is referenced by a number rather than an object.)

      FileOpen: Opens a file for input or output. (Analogous to the Open statement in classic VB.)

      Input: Reads a comma-delimited field from a text file and assigns it to a variable. (Analogous to the Input statement in classic VB.)

      LineInput: Reads a single line from an open sequential file and assigns it to a String variable. (Analogous to the Line Input statement in classic VB; also analogous to the ReadLine method of the StreamReader object.)

      InputString: Reads a specified number of characters from a text or binary file into a String variable. (Analogous to the Input function in classic VB; similar to the ReadToEnd method of the StreamReader object.)

      EOF: Returns a Boolean value indicating whether or not the end of a file opened for input or random access has been reached. (Analogous to the EOF function in classic VB.)

      Write, WriteLine: Writes data to a text file. Automatically writes the data in comma-delimited format, enclosing string fields in quotes when necessary. (Analogous to the Write statement in classic VB.) These functions will be explored in more detail in the section on Processing Text Files for Output.

      Print, PrintLine: Writes data to a text file.  (Analogous to the Print statement in classic VB and somewhat similar to the WriteLine method of the StreamWriter object, but with additional functionality.) These functions will be explored in more detail in the section on Processing Text Files for Output.

      FileClose: Closes an open file. (Analogous to the Close statement in classic VB.)

We will now explore some of these functions in more detail.

FileOpen

The FileOpen function prepares a file to be processed in the VB program.  It identifies the Windows-system file that will be used in the program and assigns the file a file number that will be used to reference that file for the remainder of the program. 

Syntax:

       FileOpen(FileNumber, FileName, Mode [, Access [, Share [, RecordLength]]])

The parameters for FileOpen are as follows:

FileNumber:

Required. Any valid file number. The FreeFile function can be used to obtain the next available file number.

FileName:

Required. String expression that specifies a valid file name; it may include directory or folder, and drive.

Mode:

Required. Enum specifying the file mode. Possible values are:

      OpenMode.Append: Opens a file for output (writing). If the file does not exist, it will be created; if it does exist, records will be added to the file after the last record in the file (the previous contents of the file will not be overwritten).

      OpenMode.Binary: Not applicable for text file processing; will be discussed in the section on binary file processing.

      OpenMode.Input: Opens a file for input (reading). The specified file must exist.

      OpenMode.Output: Opens a file for output (writing). If it does not exist, it will be created; if it does exist, its previous contents will be overwritten.

      OpenMode.Random: Not applicable for text file processing; will be discussed in the section on random file processing.

Input and LineInput functions may only be used on files opened in the Input mode; Write, WriteLine, Print, and PrintLine may only be used on files opened in the Output or Append modes.

Access:

Optional. Enum specifying the operations permitted on the open file. Possible values are:

      OpenAccess.Default or OpenAccess.ReadWrite: The file may be read from or written to.

      OpenAccess.Read: The file may be read from, but not written to.

      OpenAccess.Write: The file may be written to, but not read from.

Share:

Optional. Enum specifying the operations restricted on the open file by other processes. Possible values are:

      OpenShare.Default or OpenShare.LockReadWrite: Other processes may neither read from nor write to the file.

      OpenShare.Shared: Any process can read from or write to the file.

      OpenShare.LockRead: Other processes may not read from the file.

      OpenShare.LockWrite: Other processes may not write to the file.

RecordLength:

Optional. Number less than or equal to 32,767 (bytes). For files opened for random access, this value is the record length. For sequential files, this value is the number of characters buffered.

Example:

      FileOpen(1, "C:\Program Files\EmpMaint\EMPLOYEE.TXT", OpenMode.Input)
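The optional Access and Share arguments can be supplied when the defaults are not what you want. The sketch below opens a file for reading only while still allowing other processes to use it; it is an illustration rather than part of the sample projects, and it creates a throwaway temporary file (with made-up contents) so that it can run on its own.

```vbnet
Imports System.IO

Module FileOpenSketch

    Sub Main()
        ' Create a small throwaway file so the sketch is self-contained.
        Dim strTempFile As String = Path.GetTempFileName()
        File.WriteAllText(strTempFile, "sample line" & vbCrLf)

        ' Open for input, read-only, without locking out other processes.
        FileOpen(1, strTempFile, OpenMode.Input, OpenAccess.Read, OpenShare.Shared)
        Console.WriteLine(LineInput(1))
        FileClose(1)

        File.Delete(strTempFile)
    End Sub

End Module
```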

FreeFile

Instead of hard-coding the file number, you can use the FreeFile function to supply you with a file number that is not already in use by the system.  The FreeFile function takes no arguments and returns an integer.  To use it, declare an integer variable, then assign FreeFile to it, as follows:

      Dim intEmpFileNbr As Integer
      intEmpFileNbr = FreeFile

In the FileOpen call (and any other statement that refers to this file), use the integer variable rather than the hard-coded number.  For example:

      FileOpen(intEmpFileNbr, "C:\Program Files\EmpMaint\EMPLOYEE.TXT", OpenMode.Input)

Input

The Input function reads a field from a comma-delimited text file and stores the contents of that field into the specified variable.  The general format is:

      Input(filenumber, variable)

where

· filenumber refers to the file that was opened using that number in the FileOpen function

· variable is a variable into which the "next" data field from the file will be stored

Note: In previous versions of VB and BASIC, the "Input #" statement syntax allowed you to specify a "variable list" where you could read any number of data items from the file into corresponding variables. Generally, you would use this to read one "record's worth" of data into specified variables using just one Input # statement (for example, if a record had five fields, your Input # statement would specify a list of five variables). In VB.NET, the "variable list" format is not supported for the Input function – so for the scenario with a record containing five fields, you would simply code five separate Input functions. An example is given below.

Recall the comma-delimited version of the employee file shown earlier:

[Image: contents of the comma-delimited employee file]

Assume you declare the following variables in your program:

      Dim strEmpName As String
      Dim intDeptNbr As Integer
      Dim strJobTitle As String
      Dim dtmHireDate As Date
      Dim sngHrlyRate As Single

the set of statements

      Input(intEmpFileNbr, strEmpName)
      Input(intEmpFileNbr, intDeptNbr)
      Input(intEmpFileNbr, strJobTitle)
      Input(intEmpFileNbr, dtmHireDate)
      Input(intEmpFileNbr, sngHrlyRate)

would cause ANDERSON,ANDY to be stored in strEmpName, 100 to be stored in intDeptNbr, PROGRAMMER to be stored in strJobTitle, 3/4/1997 to be stored in dtmHireDate, and 25 to be stored in sngHrlyRate the first time that the statement was executed. 

The second time this set of statements is executed, BABCOCK, BILLY, 110, SYSTEMS ANALYST, 2/16/1996, and 33.5 would be stored respectively in strEmpName, intDeptNbr, strJobTitle, dtmHireDate, sngHrlyRate; and so on. 

As the program reads each field into its respective variable, conversion to the correct data type (Integer, Date, Single, etc.) is automatically performed.

EOF Function

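Older BASIC documentation describes a sequential file as ending with a special character called the end-of-file marker. On current Windows file systems no such character is appended; the end of the file is determined from the file's length. Either way, VB can detect that no more data remains in the file with the EOF function.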

A programming language will generally recognize EOF at either one of two times: (1) after the last record has been read – OR – (2) at the same time that the last record has been read.  COBOL falls into the first category, VB falls into the second.

In a language that recognizes EOF after the last record in the file has been read (such as COBOL), the "input" or "read" loop is set up much like a prompted dialog loop, with a priming read outside the loop; all subsequent reads occur at the bottom of the loop.  The pseudocode might be written as follows:

      READ (first) RECORD
      DO UNTIL EOF
          PROCESS THE RECORD
          READ (next) RECORD
      LOOP

In a language that recognizes EOF when the last record is read (such as VB), the "input" or "read" loop must be modified so that there is NO PRIMING READ and the read occurs as the FIRST statement in the body of the processing loop. The pseudocode might be written as follows:

      DO UNTIL EOF
          READ A RECORD
          PROCESS THE RECORD
      LOOP

The syntax of the EOF function is EOF(n) where n is a number corresponding to the file number of the file from which you want to read data. n can either be a hard-coded number or an integer variable, depending on whether or not you used FreeFile with the FileOpen function.

The EOF function can be used anywhere that a conditional expression can be used; as such, it must always follow keywords such as UNTIL, WHILE, and IF.  The EOF function can also be preceded by the keyword NOT: for example, Do Until EOF(1) is equivalent to Do While Not EOF(1).

The main loop to process the employee file might look like this (note that there is no "priming" read and that the input is done at the top of the loop):

      Do Until EOF(intEmpFileNbr)
          Input(intEmpFileNbr, strEmpName)
          Input(intEmpFileNbr, intDeptNbr)
          Input(intEmpFileNbr, strJobTitle)
          Input(intEmpFileNbr, dtmHireDate)
          Input(intEmpFileNbr, sngHrlyRate)
          ' Processing for the record would go here; for example, load some of these
          ' fields into an element of an array or list box, print a line of a report, etc.
      Loop

FileClose

When you are finished using a file in your program, you should close that file.  The FileClose function concludes input/output (I/O) to one or more files opened using the FileOpen function and frees up the system resources needed to process those files.

Syntax:

      FileClose([FileNumbers])

where FileNumbers is a comma-delimited list of zero or more file numbers representing the files to be closed. If no file numbers are specified, all open files will be closed.

The statement

            FileClose(1)

frees the resources used by the file referenced as number 1, and also terminates the association between the Windows-system file and the file number – so at this point, if you wanted to, you could use FileOpen to open a different file using 1 as the file number.

If you have more than one file open in a program, you can close multiple files with one FileClose function by separating the file numbers with commas:

            FileClose(1, 2, 68)

The statement

            FileClose()

with no file numbers specified closes all files that are currently open.
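Because an error during reading would otherwise skip the FileClose call, it can be placed in the Finally part of a Try block. The following sketch combines FreeFile, FileOpen, EOF, LineInput, and FileClose in that way; it is an illustration under that assumption, not code from the sample projects, and it uses a throwaway temporary file with made-up contents so that it can run on its own.

```vbnet
Imports System.IO

Module FileCloseSketch

    Sub Main()
        ' Create a small throwaway file so the sketch is self-contained.
        Dim strTempFile As String = Path.GetTempFileName()
        File.WriteAllLines(strTempFile, New String() {"first line", "second line"})

        Dim intFileNbr As Integer = FreeFile()
        FileOpen(intFileNbr, strTempFile, OpenMode.Input)
        Try
            ' No priming read: the read is the first statement inside the loop.
            Do Until EOF(intFileNbr)
                Console.WriteLine(LineInput(intFileNbr))
            Loop
        Finally
            ' Runs whether or not an exception occurred in the loop.
            FileClose(intFileNbr)
        End Try

        File.Delete(strTempFile)
    End Sub

End Module
```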

Code for sample program #3:

The following code reads the comma-delimited file as described above, in groups of five fields at a time (one "record's worth" of data), and writes a formatted line containing the data from each record on the console. (This program produces the exact same output as the previous program examples.)

      Module Module1

          Sub Main()

              Dim strFileName As String = My.Application.Info.DirectoryPath _
                                        & "\employee_comma.txt"
              Dim intEmpFileNbr As Integer
              Dim strEmpName As String
              Dim intDeptNbr As Integer
              Dim strJobTitle As String
              Dim dtmHireDate As Date
              Dim sngHrlyRate As Single

              Console.WriteLine("The records in the employee_comma.txt file are:")
              Console.WriteLine("")
              Console.WriteLine("EMPLOYEE NAME".PadRight(20) _
                              & Space(3) _
                              & "DEPT" _
                              & Space(3) _
                              & "JOB TITLE".PadRight(21) _
                              & Space(3) _
                              & "HIRE DATE " _
                              & Space(3) _
                              & "HRLY RATE")
              Console.WriteLine("-------------".PadRight(20) _
                              & Space(3) _
                              & "----" _
                              & Space(3) _
                              & "---------".PadRight(21) _
                              & Space(3) _
                              & "--------- " _
                              & Space(3) _
                              & "---------")

              intEmpFileNbr = FreeFile()

              FileOpen(intEmpFileNbr, strFileName, OpenMode.Input)

              Do Until EOF(intEmpFileNbr)

                  ' Read one "record's worth" of fields into their
                  ' corresponding variables
                  Input(intEmpFileNbr, strEmpName)
                  Input(intEmpFileNbr, intDeptNbr)
                  Input(intEmpFileNbr, strJobTitle)
                  Input(intEmpFileNbr, dtmHireDate)
                  Input(intEmpFileNbr, sngHrlyRate)

                  ' Output the data to the console as a formatted line ...
                  Console.WriteLine(strEmpName.PadRight(20) _
                                  & Space(3) _
                                  & intDeptNbr.ToString.PadLeft(4) _
                                  & Space(3) _
                                  & strJobTitle.PadRight(21) _
                                  & Space(3) _
                                  & Format(dtmHireDate, "MM/dd/yyyy").PadRight(10) _
                                  & Space(3) _
                                  & Format(sngHrlyRate, "Currency").PadLeft(9))
              Loop

              FileClose(intEmpFileNbr)

              Console.WriteLine("")
              Console.WriteLine("Press Enter to close this window.")
              Console.ReadLine()

          End Sub

      End Module

A synopsis of the code follows:

Because this example uses the Microsoft.VisualBasic namespace functions only, the System.IO namespace is not required. In the Main procedure, variables to establish the filename, file number, and data fields are declared. Several lines that serve as "headings" are then written to the console. The file is then opened, and the program then commences with the main processing loop that will process data in the file in groups of five fields (one "record's worth" of data) at a time. As long as there is data remaining in the file (as tested with the EOF function), the data fields are read in with the Input function. A formatted string containing the data fields is then written to the console. When the main processing loop ends (i.e., the EOF function detects end-of-file), the file is closed, and the program ends with our standard message to close the console window.

Download the VB.NET project code for this example: Read Comma Delimited Text File Example

As in the example above, the fields within each record (line) of the file are delimited by a comma (,), and, since the employee name field itself contains a comma, it is also enclosed in quotes. The content of the file is shown below.

[Image: contents of the comma-delimited employee file]

The intention of the quotes is to tell the program that is to process this file to treat the data between the quotes as a single field, thus overriding the function of the comma in that case. As shown above, we can use the "classic" method of using the retooled Input statement, but as an alternative, we can use the FileIO.TextFieldParser class, introduced in VB 2005. The FileIO.TextFieldParser class is actually a member of the Microsoft.VisualBasic namespace, so we need not import System.IO for this example.

Code for sample program #4:

The following code reads the comma-delimited file as described above and writes a formatted line containing the data from each record on the console. (This program produces the exact same output as the previous program examples.)

      Module Module1

          Sub Main()

              Dim strFileName As String = My.Application.Info.DirectoryPath _
                                        & "\employee_comma.txt"
              Dim objTFParser As FileIO.TextFieldParser
              Dim astrTFFields() As String

              Console.WriteLine("The records in the employee_comma.txt file are:")
              Console.WriteLine("")
              Console.WriteLine("EMPLOYEE NAME".PadRight(20) _
                              & Space(3) _
                              & "DEPT" _
                              & Space(3) _
                              & "JOB TITLE".PadRight(21) _
                              & Space(3) _
                              & "HIRE DATE " _
                              & Space(3) _
                              & "HRLY RATE")
              Console.WriteLine("-------------".PadRight(20) _
                              & Space(3) _
                              & "----" _
                              & Space(3) _
                              & "---------".PadRight(21) _
                              & Space(3) _
                              & "--------- " _
                              & Space(3) _
                              & "---------")

              objTFParser = New FileIO.TextFieldParser(strFileName)
              objTFParser.TextFieldType = FileIO.FieldType.Delimited
              objTFParser.SetDelimiters(",")
              objTFParser.HasFieldsEnclosedInQuotes = True

              Do Until objTFParser.EndOfData
                  ' Read one "record's worth" of fields into the
                  ' "astrTFFields" array ...
                  astrTFFields = objTFParser.ReadFields

                  ' Output the data to the console as a formatted line ...
                  Console.WriteLine(astrTFFields(0).PadRight(20) _
                                  & Space(3) _
                                  & astrTFFields(1).ToString.PadLeft(4) _
                                  & Space(3) _
                                  & astrTFFields(2).PadRight(21) _
                                  & Space(3) _
                                  & Format(CDate(astrTFFields(3)), "MM/dd/yyyy").PadRight(10) _
                                  & Space(3) _
                                  & Format(CSng(astrTFFields(4)), "Currency").PadLeft(9))
              Loop

              objTFParser.Close()

              Console.WriteLine("")
              Console.WriteLine("Press Enter to close this window.")
              Console.ReadLine()

          End Sub

      End Module

A synopsis of the code follows:

Note that in addition to declaring a string variable for the filename, the variable objTFParser is declared as a FileIO.TextFieldParser object, and a string array variable astrTFFields() is declared (this array will hold the content of each record, with each element representing a field within the record). Several lines that serve as "headings" are then written to the console. The objTFParser variable is then set (with the statement objTFParser = New FileIO.TextFieldParser(strFileName)), which also serves to open the file. In the statements that follow, we also tell objTFParser that it is a delimited file, that it is delimited by commas, and that it has fields enclosed in quotes. The program then commences with the main processing loop that will process each record in the file, line by line. As long as there is data remaining in the file (as tested with the EndOfData property), a record is read from the file using the ReadFields method into the astrTFFields array variable. The ReadFields method acts in a manner similar to the Split method: it stores the data from the fields in elements of the astrTFFields array, but it also honors the quotes around a field and removes them. A formatted string using the array elements is then written to the console. When the main processing loop ends (i.e., EndOfData is True), the TextFieldParser is closed, and the program ends with our standard message to close the console window.
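The TextFieldParser can also be created from any TextReader, which makes it easy to try out without a file on disk. The sketch below feeds it two records from a string, wraps it in a Using block so it is closed automatically, and catches the MalformedLineException that ReadFields throws for a line it cannot parse. The record text is an assumption based on the values quoted earlier in this article, and this variation is not part of the downloadable project.

```vbnet
Module ParserSketch

    Sub Main()
        ' Assumed form of the first two records of the comma-delimited file.
        Dim strData As String = _
            """ANDERSON,ANDY"",100,PROGRAMMER,3/4/1997,25" & vbCrLf & _
            """BABCOCK,BILLY"",110,SYSTEMS ANALYST,2/16/1996,33.5"

        ' The parser is closed automatically at End Using.
        Using objTFParser As New FileIO.TextFieldParser(New System.IO.StringReader(strData))
            objTFParser.TextFieldType = FileIO.FieldType.Delimited
            objTFParser.SetDelimiters(",")
            objTFParser.HasFieldsEnclosedInQuotes = True

            Do Until objTFParser.EndOfData
                Try
                    Dim astrTFFields() As String = objTFParser.ReadFields()
                    ' Each record yields 5 fields; the name arrives without its quotes.
                    Console.WriteLine(astrTFFields.Length.ToString() _
                                    & " fields, name = " & astrTFFields(0))
                Catch ex As FileIO.MalformedLineException
                    ' Report the bad line and carry on with the next one.
                    Console.WriteLine("Skipping malformed line " & ex.LineNumber.ToString())
                End Try
            Loop
        End Using
    End Sub

End Module
```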

Download the VB project code for this example: VB.NET Text Field Parser Example


View the original article here
